PREFACE TO THE SECOND EDITION


In the second edition the main plan of the book has been left unchanged. 
Small amounts of material have been added in a great number of places, 
and improvements have been attempted at many points. It was felt that 
a lack in the first edition was its omission of the theory of Laplace and 
Fourier transforms. This has been remedied in Chapter 8. The discussion
of numerical calculations, integral equations and group theory has
likewise been augmented by removal of unnecessary items and some
replacements.

Ambiguities, errors and pedagogical faults have been sought in an 
endeavor to eliminate them. If we have partly succeeded in this task, 
we owe it to a host of readers and our students who have given us the 
benefit of their advice and criticism. In this respect we are particularly
grateful to many scientists at the Navy Electronics Laboratory who pre- 
pared a detailed list of errors soon after the first edition of the book ap- 
peared. A similar and very useful list of errata was sent by Professor 
Pentti Salomaa of the University of Turku, Finland, to whom we express 
our indebtedness. Dr. M. H. Greenblatt of R.C.A. suggested an improvement
in Chapter 12 of which we have made use. Finally, we acknowledge
stimulus and aid coming from the careful work of Professors Tsugihiko
Sato and Makoto Kuminune of Japan who, in translating the first edition, 
discovered a number of inaccuracies which have now been corrected. 


H. M. 
G. M. M. 


New Haven, Conn. 
November, 1955 


PREFACE 


The authors’ aim has been to present, between the covers of a single 
book, those parts of mathematics which form the tools of the modern 
worker in theoretical physics and chemistry. They have endeavored to 
do this by steering a middle course between the mere recording of facts and
formulas which is typical of handbook treatments, and the ponderous
development which characterizes treatises in special fields. Therefore, 
as far as space permitted, all results have been embedded in the logical 
texture of proofs. Occasionally, when full demonstrations are lengthy or 
not particularly illuminating with respect to the subject at hand, they 
have been omitted in favor of references to the literature. Except for the 
first chapter, which is primarily a survey, proofs have always been given 
where omission would destroy the continuity of treatment.

Arbitrary selection of topics has been necessary for lack of space. This 
was based partly on the authors’ opinions as to the relevance of various 
subjects, partly on the results of consultations with colleagues. The 
degree of difficulty of the treatment is such that a Senior majoring in physics 
or chemistry would be able to read most parts of the book with under- 
standing. 

While inclusion of large collections of routine problems did not seem 
conformable to the purpose of the book, the authors have felt that its 
usefulness might be augmented by two minor pedagogical devices: the 
insertion here and there of fully worked examples illustrative of the theory 
under discussion, and the dispersal, throughout the book, of special prob- 
lems confirming, and in some cases supplementing, the ideas of the text. 
Answers to the problems are usually given. 

The degree of rigor to which we have aspired is that customary in
careful scientific demonstrations, not the lofty heights accessible to the
pure mathematician. For this we make no apology; if the history of the 
exact sciences teaches anything it is that emphasis on extreme rigor often 
engenders sterility, and that the successful pioneer depends more on 
brilliant hunches than on the results of existence theorems. We trust, of 
course, that our effort to avoid rigor mortis has not brought us danger- 
ously close to the opposite extreme of sloppy reasoning. 

A careful attempt has been made to insure continuity of presentation 
within each chapter, and as far as possible throughout the book. The
diversity of the subjects has made it necessary to refer occasionally to 
chapters ahead. Whenever this occurs it is done reluctantly and in order 
to avoid repetition. 

As to form, considerations of literacy have often been given secondary 
rank in favor of conciseness and brevity, and no great attempt has been 
made to disguise individual authorship by artificially uniformising the 
style. 

The authors have used the material of several of the chapters in a num- 
ber of special courses and have found its collection into a single volume 
convenient. To venture a few specific suggestions, the book, if it were 
judged favorably by mathematicians, would serve as a foundation for 
courses in applied mathematics on the senior and first year graduate level.
A thorough introductory course in quantum mechanics could be based on 
chapter 2, parts of 3, 8 and 10, and chapter 11. Chapters 1, 10 and parts 
of 11 may be used in a short course which reviews thermodynamics and 
then treats statistical mechanics. Reading of chapters 4, 9, and 15 would 
prepare for an understanding of special treatments dealing with polyatomic 
molecules, and the liquid and solid state. Since ability to handle numeri- 
cal computations is very important in all branches of physics and chemistry, 
a chapter designed to familiarize the reader with all tools likely to be needed 
in such work has been included. 

The index has been made sufficiently complete so that the book can 
serve as a ready reference to definitions, theorems and proofs. Graduate 
students and scientists whose memory of specific mathematical details is 
dimmed may find it useful in review. Last, but not least, the authors 
have had in mind the adventurous student of physics and chemistry who 
wishes to improve his mathematical knowledge through self-study. 


HENRY MARGENAU
GEORGE M. MURPHY


New Haven, Conn. 
March, 1943 
CHAPTER 1
THE MATHEMATICS OF THERMODYNAMICS 


Most of the chapters of this book endeavor to treat some single mathe- 
matical method in a systematic manner. The subject of thermodynamics, 
being highly empirical and synoptic in its contents, does not contain a very 
uniform method of analysis. Nevertheless, it involves mathematical
elements of considerable interest, chiefly centered about partial differentiation.
Rather than omit these entirely from consideration, it seemed well
to devote the present chapter to them. Of necessity, the treatment is 
perhaps less systematic than elsewhere. It is placed at the beginning 
because most readers are likely to have some familiarity with the subject 
and because the mathematical methods are simple. (A reading of the first 
chapter is not essential for an understanding of the remainder of the book.) 

1.1. Introduction.—The science of thermodynamics is concerned with 
the laws that govern the transformations of energy of one kind into another 
during physical or chemical changes. These changes are assumed to occur 
within a thermodynamic system which is completely isolated from its sur- 
roundings. Such a system is described by means of thermodynamic variables 
which are of two kinds. Extensive variables are proportional to the amount 
of matter which is being considered; typical examples are the volume or the 
total energy of the system. Variables which are independent of the amount 
of matter present, such as pressure or temperature, are called intensive 
variables. 

It is found experimentally that it is not possible to change all of these 
variables independently, for if certain ones of them are held constant, the 
remaining ones are automatically fixed in value. Mathematically, such a 
situation is treated by the method of partial differentiation. Furthermore, 
a certain type of differential, called the exact differential, and an integral,
known as the line integral, are of great importance in the study of thermodynamics.
We propose to describe these matters in a general way and to
apply them to a few specific problems. We assume that the reader is
familiar with the general ideas of thermodynamics and refer him to other
sources¹ for a more complete treatment of the physical details.


1 A representative set of references on thermodynamics will be found at the end of 
this chapter. Although not easy to read, serious students of the subject should be 
familiar with the work of J. Willard Gibbs, Transactions of the Conn. Acad., 1875-1878; 
“Collected Works,” Vol. I, Longmans, Green and Co., New York, 1928; “A Commentary
on the Scientific Writings of J. W. Gibbs,” 2 vols., Yale University Press,
New Haven, 1937. 
1.2. Differentiation of Functions of Several Independent Variables.—If
z is a single-valued function of two real, independent variables, x and y,

$$z = f(x,y)$$

z is said to be an explicit function of x and y. The relation between the
three variables may be represented by plotting x, y and z along the axes of a
Cartesian coordinate system, the result being a surface. If we wish to
study the motion of some point (x,y) over the surface, there are three
possible cases: (a) x varies and y remains constant; (b) y varies, x remaining
constant; (c) both x and y vary simultaneously.

In the first and second cases, the path of the point will be along the
curves produced when planes, parallel to the XZ- or YZ-coordinate planes,
intersect the original surface. If x is increased by the small quantity Δx
and y remains constant, z changes from f(x,y) to f(x + Δx,y), and the
partial derivative of z with respect to x at the point (x,y) is defined by

$$f_x(x,y) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x,y)}{\Delta x}$$

The following alternative notations are often used

$$f_x(x,y) = z_x(x,y) = \left(\frac{\partial f}{\partial x}\right)_y = \left(\frac{\partial z}{\partial x}\right)_y \tag{1-1}$$


where the constancy of y is indicated by the subscript. Since x and y
are completely independent, the partial derivative is evaluated by the
usual method for the differentiation of a function of a single variable, y
being treated as a constant.
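
The definition can be tried out numerically. The following is a minimal sketch, not part of the original text: the surface f(x,y) = x²y + sin y and the step size h are arbitrary illustrative choices, and a symmetric (central) difference quotient is used in place of the one-sided quotient of the definition because it converges faster as h shrinks.

```python
import math

def f(x, y):
    """Illustrative surface z = f(x, y); an arbitrary choice for this sketch."""
    return x**2 * y + math.sin(y)

def partial_x(func, x, y, h=1e-6):
    """Estimate (dz/dx) at constant y: only x is varied, by +/- h."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, h=1e-6):
    """Estimate (dz/dy) at constant x: only y is varied, by +/- h."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 1.5, 0.7
# Single-variable rules, treating the other variable as a constant,
# give f_x = 2xy and f_y = x^2 + cos(y) for this particular surface.
print(partial_x(f, x0, y0), 2 * x0 * y0)
print(partial_y(f, x0, y0), x0**2 + math.cos(y0))
```

Each printed pair should agree to several decimal places, which is the numerical counterpart of treating the other variable as a constant.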

Defining the partial derivative of z with respect to y (x remaining
constant) in a similar way, we may write

$$f_y(x,y) = z_y(x,y) = \left(\frac{\partial f}{\partial y}\right)_x = \left(\frac{\partial z}{\partial y}\right)_x \tag{1-2}$$

If z is a function of more than two variables

$$z = f(x_1, x_2, \cdots, x_n)$$

the simple geometric interpretation is lacking, but such a symbol as

$$\left(\frac{\partial z}{\partial x_1}\right)_{x_2, x_3, \cdots, x_n}$$

still means that the function is to be differentiated with respect to $x_1$ by
the usual rules, all other variables being considered as constants.

Since the partial derivatives are themselves functions of the independent 
variables, they may be differentiated again to give second and higher 
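derivatives

$$\frac{\partial^2 z}{\partial x^2} = f_{xx}; \quad \frac{\partial^2 z}{\partial y^2} = f_{yy}; \quad \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial^2 z}{\partial y\,\partial x} = f_{xy}; \quad \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial x\,\partial y} = f_{yx} \tag{1-3}$$

It is not always true that $f_{xy} = f_{yx}$; but the order of differentiation is
immaterial if the function and its derivatives are continuous. Since this is
usually the case in physical applications, quantities such as $f_{xy}$, $f_{yx}$ or
$f_{xxy}$, $f_{xyx}$, $f_{yxx}$ will be considered identical in the present treatment.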
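
That the order of differentiation can matter when the continuity condition fails may be seen from a standard counterexample, which is not taken from the text: f = xy(x² − y²)/(x² + y²), with f = 0 at the origin. The sketch below estimates the two mixed second derivatives at the origin by nested difference quotients; the function names and step sizes are illustrative assumptions.

```python
def f(x, y):
    """Standard counterexample (an assumption of this sketch); f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

def first_x(x, y, h=1e-7):
    """Central-difference estimate of the first derivative with respect to x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def first_y(x, y, h=1e-7):
    """Central-difference estimate of the first derivative with respect to y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def x_then_y_at_origin(k=1e-3):
    """Differentiate with respect to x first, then with respect to y, at (0, 0)."""
    return (first_x(0.0, k) - first_x(0.0, -k)) / (2 * k)

def y_then_x_at_origin(k=1e-3):
    """Differentiate with respect to y first, then with respect to x, at (0, 0)."""
    return (first_y(k, 0.0) - first_y(-k, 0.0)) / (2 * k)

print(x_then_y_at_origin())   # close to -1
print(y_then_x_at_origin())   # close to +1
```

The inner step h is chosen much smaller than the outer step k so that the first derivatives are resolved before the second difference is taken. For this function the second derivatives are discontinuous at the origin, which is why the two orders disagree there.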

1.3. Total Differentials.—In the third case of sec. 1.2, both x and y vary
simultaneously or, in geometric language, the point moves along a curve
determined by the intersection with z = f(x,y) of a surface which is neither
parallel with the XZ- nor YZ-coordinate plane. Since x and y are independent,
both Δx and Δy approach zero as Δz approaches zero. In that
case the change in z caused by increments Δx and Δy, called the total
differential of z, is given by

$$dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy \tag{1-4}$$

If it happens that x and y depend on a single independent variable u (it
might be the arc length of the curve along which the point moves, or the
time),

$$z = f(x,y); \quad x = F_1(u); \quad y = F_2(u)$$

$$\frac{dz}{du} = \left(\frac{\partial z}{\partial x}\right)_y \frac{dx}{du} + \left(\frac{\partial z}{\partial y}\right)_x \frac{dy}{du} \tag{1-5}$$

For the special case,

$$z = f(x,y); \quad x = F(y); \quad y \text{ independent}$$

then, from (4)

$$\frac{dz}{dy} = \left(\frac{\partial z}{\partial x}\right)_y \frac{dx}{dy} + \left(\frac{\partial z}{\partial y}\right)_x \tag{1-6}$$


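An important generalization of these results arises when x, y, ··· are not
independent variables but are each functions of a finite number of independent
variables, u, v, ···

$$f = f(x, y, \cdots)$$

$$x = F_1(u, v, w, \cdots)$$

$$y = F_2(u, v, w, \cdots)$$

$$\cdots$$

Then, from (4)

$$df = \left(\frac{\partial f}{\partial u}\right) du + \left(\frac{\partial f}{\partial v}\right) dv + \cdots \tag{1-7}$$

and from (5)

$$\frac{\partial f}{\partial u} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} + \cdots \tag{1-8}$$

with similar expressions for (∂f/∂v), (∂f/∂w), ···. When these are put into
(7) we obtain

$$\begin{aligned}
df &= \left[\frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} + \cdots\right] du + \left[\frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} + \cdots\right] dv + \cdots \\
&= \frac{\partial f}{\partial x}\left[\frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv + \cdots\right] + \frac{\partial f}{\partial y}\left[\frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv + \cdots\right] + \cdots
\end{aligned} \tag{1-9}$$

Since u, v, ··· are independent variables, we may write

$$\begin{aligned}
dx &= \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv + \cdots \\
dy &= \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv + \cdots
\end{aligned} \tag{1-10}$$

Comparing coefficients in (9) and (10), we finally obtain

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \cdots \tag{1-11}$$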


The difference between (7) and (11) should be noted: in the former equation
the partial derivatives are taken with respect to the independent
variables, while in the latter, with respect to the dependent variables. The
important conclusion may thus be drawn that the total differential may be
written either in the form (7) or (11); that is, df may be composed additively
of terms $(\partial f/\partial x)\,dx$, ··· regardless of whether x is a dependent or an
independent variable.
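
The chain rule (1-8) that underlies this conclusion can be checked numerically. The sketch below is illustrative only: the function f(x,y) = x²y and the substitution x = u cos v, y = u sin v are assumptions chosen for the example, and the helper name diff is hypothetical.

```python
import math

def f(x, y):
    return x * x * y              # illustrative f(x, y)

def X(u, v):
    return u * math.cos(v)        # illustrative x = F1(u, v)

def Y(u, v):
    return u * math.sin(v)        # illustrative y = F2(u, v)

def diff(g, args, i, h=1e-6):
    """Central difference of g with respect to its i-th argument."""
    hi, lo = list(args), list(args)
    hi[i] += h
    lo[i] -= h
    return (g(*hi) - g(*lo)) / (2 * h)

def df_du_direct(u, v):
    """Differentiate the composite function f(X(u,v), Y(u,v)) directly."""
    return diff(lambda a, b: f(X(a, b), Y(a, b)), (u, v), 0)

def df_du_chain(u, v):
    """Right-hand side of (1-8): (df/dx)(dx/du) + (df/dy)(dy/du)."""
    x, y = X(u, v), Y(u, v)
    return (diff(f, (x, y), 0) * diff(X, (u, v), 0)
            + diff(f, (x, y), 1) * diff(Y, (u, v), 0))

print(df_du_direct(1.2, 0.4), df_du_chain(1.2, 0.4))
```

Both numbers should agree with the analytic value 3u² cos²v sin v obtained by substituting first and differentiating afterwards.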
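1.4. Higher Order Differentials.—Differentials of the second, third and
higher orders are defined by

$$d^2f = d(df); \quad d^3f = d(d^2f); \quad \cdots; \quad d^nf = d(d^{n-1}f)$$

If there are two variables x and y, we obtain from (4)

$$d^2f = d(df) = d\left(\frac{\partial f}{\partial x}\right) dx + \frac{\partial f}{\partial x}\,d(dx) + d\left(\frac{\partial f}{\partial y}\right) dy + \frac{\partial f}{\partial y}\,d(dy)$$

However,

$$d\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) dx + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) dy = \frac{\partial^2 f}{\partial x^2}\,dx + \frac{\partial^2 f}{\partial x\,\partial y}\,dy$$

with a similar expression for $d(\partial f/\partial y)$, hence

$$d^2f = \frac{\partial^2 f}{\partial x^2}(dx)^2 + 2\frac{\partial^2 f}{\partial x\,\partial y}\,dx\,dy + \frac{\partial^2 f}{\partial y^2}(dy)^2 + \frac{\partial f}{\partial x}\,d^2x + \frac{\partial f}{\partial y}\,d^2y$$

If x and y are independent variables, $d^2x = d^3x = \cdots = d^nx = \cdots = d^ny = 0$,
and the n-th order differential becomes

$$\begin{aligned}
d^nf = {} & \frac{\partial^n f}{\partial x^n}\,dx^n + \binom{n}{1}\frac{\partial^n f}{\partial x^{n-1}\,\partial y}\,dx^{n-1}dy + \cdots + \binom{n}{k}\frac{\partial^n f}{\partial x^{n-k}\,\partial y^k}\,dx^{n-k}dy^k \\
& + \cdots + \binom{n}{n-1}\frac{\partial^n f}{\partial x\,\partial y^{n-1}}\,dx\,dy^{n-1} + \frac{\partial^n f}{\partial y^n}\,dy^n
\end{aligned} \tag{1-12}$$

where the $\binom{n}{k}$ are the binomial coefficients, $\binom{n}{k} = \binom{n}{n-k} = n!/k!(n-k)!$
(Cf. sec. 12.2.)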


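Example. Calculate dp and d²p for a gas obeying van der Waals'
equation:

$$p = \frac{RT}{V - b} - \frac{a}{V^2}$$

$$\left(\frac{\partial p}{\partial T}\right)_V = \frac{R}{V - b}; \quad \left(\frac{\partial p}{\partial V}\right)_T = -\frac{RT}{(V - b)^2} + \frac{2a}{V^3}$$

$$\left(\frac{\partial^2 p}{\partial T^2}\right)_V = 0; \quad \left(\frac{\partial^2 p}{\partial V^2}\right)_T = \frac{2RT}{(V - b)^3} - \frac{6a}{V^4}$$

$$\frac{\partial}{\partial V}\left(\frac{\partial p}{\partial T}\right) = -\frac{R}{(V - b)^2} = \frac{\partial}{\partial T}\left(\frac{\partial p}{\partial V}\right)$$

$$dp = \frac{R}{V - b}\,dT - \left[\frac{RT}{(V - b)^2} - \frac{2a}{V^3}\right] dV$$

$$d^2p = \left[\frac{2RT}{(V - b)^3} - \frac{6a}{V^4}\right](dV)^2 - \frac{2R}{(V - b)^2}\,dV\,dT$$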

1.5. Implicit Functions.—In the preceding discussion, the dependence
of one variable on another has been given in explicit form, as z = f(x,y).
Let us assume the relation between the variables to be given in implicit
form such as f(x,y) = 0. If it is now desired to compute dy/dx, one could
solve f(x,y) = 0 for y and then differentiate. This procedure, which is
often needlessly complicated, may however be avoided, for, according
to (4),

$$df = \left(\frac{\partial f}{\partial x}\right)_y dx + \left(\frac{\partial f}{\partial y}\right)_x dy = 0 \tag{1-13}$$

and

$$\frac{dy}{dx} = -\frac{(\partial f/\partial x)_y}{(\partial f/\partial y)_x}$$
If the equations for a circle, x² + y² − a² = 0, or an ellipse,
x²/a² + y²/b² − 1 = 0, are taken for f(x,y) = 0, the advantage of using
this method to obtain derivatives is at once evident.
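
For the ellipse the comparison can be made explicit. Dividing (1-13) by dx gives dy/dx = −(∂f/∂x)/(∂f/∂y), which for the ellipse is −b²x/(a²y). The sketch below, with illustrative semi-axes and hypothetical function names, compares this with the derivative obtained by first solving for y.

```python
import math

A_AXIS, B_AXIS = 3.0, 2.0   # illustrative semi-axes a and b

def F(x, y):
    """Implicit relation F(x, y) = 0 for the ellipse."""
    return x**2 / A_AXIS**2 + y**2 / B_AXIS**2 - 1.0

def dydx_implicit(x, y, h=1e-6):
    """dy/dx = -(dF/dx)/(dF/dy), with the partials estimated numerically."""
    Fx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    Fy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return -Fx / Fy

def y_upper(x):
    """Explicit solution for the upper half of the ellipse."""
    return B_AXIS * math.sqrt(1.0 - x**2 / A_AXIS**2)

def dydx_explicit(x, h=1e-6):
    """Derivative obtained by solving for y first and then differentiating."""
    return (y_upper(x + h) - y_upper(x - h)) / (2 * h)

x0 = 1.2
y0 = y_upper(x0)
print(dydx_implicit(x0, y0), dydx_explicit(x0), -B_AXIS**2 * x0 / (A_AXIS**2 * y0))
```

All three values agree; the implicit route never requires the square root.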

If an implicit relation is given between three variables, F(x,y,z) = 0,
any one may be considered to depend on the other two, for there are three
possible relations

$$x = f(y,z); \quad y = g(x,z); \quad z = h(x,y)$$

If z be taken as the dependent variable, then

$$dF = F_x\,dx + F_y\,dy + F_z\,dz = 0$$

At constant y, dy = 0, so that

$$\left(\frac{\partial z}{\partial x}\right)_y = -\frac{F_x}{F_z} \tag{1-14}$$

at constant z, dz = 0, hence

$$\left(\frac{\partial y}{\partial x}\right)_z = -\frac{F_x}{F_y} \tag{1-15}$$
A third possibility arises if two relations are given between three variables

$$f(x,y,z) = 0$$

$$g(x,y,z) = 0$$

Then

$$df = f_x\,dx + f_y\,dy + f_z\,dz = 0$$

$$dg = g_x\,dx + g_y\,dy + g_z\,dz = 0$$

Solving these two equations, we obtain (see sec. 10.9)

$$dx : dy : dz = \begin{vmatrix} f_y & f_z \\ g_y & g_z \end{vmatrix} : \begin{vmatrix} f_z & f_x \\ g_z & g_x \end{vmatrix} : \begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix}$$


Further examples of the properties of implicit functions and their deriva- 
tives will be found in the discussion of thermodynamic quantities. 

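1.6. Implicit Functions in Thermodynamics.—The simplest thermodynamic
systems are homogeneous fluids or solids, subjected to no external
stresses except a constant hydrostatic pressure. Investigation shows that
for all such systems, there is an equation of state or characteristic equation of
the form

$$f(p,V,T) = 0 \tag{1-16}$$

where p is the pressure exerted by the system, V is its volume and T, its
temperature on some suitable scale. From (16), an equation of the form of
(13) may then be obtained.

$$df = \left(\frac{\partial f}{\partial p}\right)_{V,T} dp + \left(\frac{\partial f}{\partial V}\right)_{p,T} dV + \left(\frac{\partial f}{\partial T}\right)_{p,V} dT = 0$$

Setting dp, dV, dT equal to zero, successively, there results a set of equations
similar to (14) and (15)

$$\begin{aligned}
\left(\frac{\partial V}{\partial T}\right)_p &= -\frac{(\partial f/\partial T)_{p,V}}{(\partial f/\partial V)_{p,T}} = \frac{1}{(\partial T/\partial V)_p} \\
\left(\frac{\partial T}{\partial p}\right)_V &= -\frac{(\partial f/\partial p)_{V,T}}{(\partial f/\partial T)_{p,V}} = \frac{1}{(\partial p/\partial T)_V} \\
\left(\frac{\partial p}{\partial V}\right)_T &= -\frac{(\partial f/\partial V)_{p,T}}{(\partial f/\partial p)_{V,T}} = \frac{1}{(\partial V/\partial p)_T}
\end{aligned} \tag{1-17}$$

Three possible products may be found by multiplying any pair of these
equations and removing the common terms. A typical one is

$$\left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p = -\left(\frac{\partial p}{\partial T}\right)_V \tag{1-18}$$

The product of all three derivatives is

$$\left(\frac{\partial V}{\partial T}\right)_p \left(\frac{\partial T}{\partial p}\right)_V \left(\frac{\partial p}{\partial V}\right)_T = -1 \tag{1-19}$$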


These results are of considerable importance since they are verified by 
experiment, the derivatives being proportional to such physical quantities 
as the coefficients of compressibility, thermal expansion and temperature 
increase with pressure. 
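
That the product of the three derivatives is −1, and not +1 as a naive cancellation of differentials would suggest, can be verified for one mole of an ideal gas, pV = RT. In the sketch below each derivative is estimated independently from the explicit function appropriate to it; the state point and step sizes are illustrative assumptions.

```python
R = 8.314  # J / (mol K)

def V_of(p, T):
    return R * T / p       # V as a function of p and T

def T_of(p, V):
    return p * V / R       # T as a function of p and V

def p_of(V, T):
    return R * T / V       # p as a function of V and T

def ddx(func, a, b, h):
    """Central difference in the first argument, the second held constant."""
    return (func(a + h, b) - func(a - h, b)) / (2 * h)

def cyclic_product(p, T):
    V = V_of(p, T)
    dV_dT_p = ddx(lambda t, pp: V_of(pp, t), T, p, 1e-3)   # (dV/dT) at constant p
    dT_dp_V = ddx(T_of, p, V, 1e-1)                        # (dT/dp) at constant V
    dp_dV_T = ddx(p_of, V, T, 1e-7)                        # (dp/dV) at constant T
    return dV_dT_p * dT_dp_V * dp_dV_T

print(cyclic_product(1.0e5, 300.0))   # close to -1
```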

1.7. Exact Differentials and Line Integrals.—It is often required, in
thermodynamic problems, to find values of a function u(x,y) at two points
$(x_1,y_1)$ and $(x_2,y_2)$ by integration of an equation

$$du(x,y) = M(x,y)\,dx + N(x,y)\,dy \tag{1-20}$$

between the limits $u_1$ and $u_2$.
The attempted integration results in such a symbol as $\int M(x,y)\,dx$,
which is meaningless unless y can be eliminated by a relation, y = f(x).
This is equivalent to specifying the path in the XY-plane along which the
integration is performed, hence integrals of (20) are known as line integrals.
There are many of these paths, the value of the definite integral differing,
in general, for each. The situation is particularly simple when du is a total
differential, or, as it is often called, a complete or exact differential. Comparison
of (4) with (20) shows that in this case

$$M(x,y) = \partial u/\partial x; \quad N(x,y) = \partial u/\partial y \tag{1-21}$$

Moreover, since the order of differentiation is of no importance, it follows
that

$$\partial M/\partial y = \partial^2 u/\partial x\,\partial y = \partial N/\partial x \tag{1-22}$$

Inspection of (21) shows that u may be found by integration even when a
functional relation between x and y is unknown. In other words, the line
integral is independent of the path; it depends only on the values of x and
y at the upper and lower limits. The function u is then said to be a point
function.

In thermodynamics, it frequently happens that the upper and lower 
limits are the same, that is, the integration is performed around a complete 
cycle. If the differential du is exact, then the value of the line integral is 
zero; if du is inexact, integration around a closed cycle gives a result not 
equal to zero. 
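
The test for exactness, ∂M/∂y = ∂N/∂x, lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions: the two differential forms are examples chosen here, not taken from the text, and agreement at a handful of sample points is only a necessary indication of exactness, not a proof.

```python
def cross_derivatives(M, N, x, y, h=1e-5):
    """Return numerical estimates of (dM/dy, dN/dx) at the point (x, y)."""
    dM_dy = (M(x, y + h) - M(x, y - h)) / (2 * h)
    dN_dx = (N(x + h, y) - N(x - h, y)) / (2 * h)
    return dM_dy, dN_dx

def looks_exact(M, N, points, tol=1e-6):
    """True if dM/dy and dN/dx agree within tol at every sample point."""
    for x, y in points:
        dM_dy, dN_dx = cross_derivatives(M, N, x, y)
        if abs(dM_dy - dN_dx) > tol:
            return False
    return True

samples = [(0.5, 1.0), (1.0, 2.0), (2.0, -1.5)]

# du = 2xy dx + x^2 dy is the total differential of u = x^2 y.
exact_form = (lambda x, y: 2 * x * y, lambda x, y: x * x)

# -y dx + x dy is not the total differential of any function of x and y.
inexact_form = (lambda x, y: -y, lambda x, y: x)

print(looks_exact(*exact_form, samples))     # True
print(looks_exact(*inexact_form, samples))   # False
```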

1.8. Exact and Inexact Differentials in Thermodynamics.—Examples
of exact and inexact differentials are readily found in thermodynamics.
Consider a mole of an ideal gas, whose equation of state is pV = RT. Let
the initial conditions be $V_1$, $p_1$ and $T_1$ and the final conditions be $V_2$, $p_2$
and $T_2$. Calculate the change in volume and the work done in going
from the initial to the final state, the integration being along two different
paths in each case.

[Fig. 1-1]

Since V = f(p,T),

$$dV = \left(\frac{\partial V}{\partial T}\right)_p dT + \left(\frac{\partial V}{\partial p}\right)_T dp = \frac{R}{p}\,dT - \frac{RT}{p^2}\,dp \tag{1-23}$$

Let the first equation of path (AC in Fig. 1) be

$$T - T_1 = \left(\frac{T_2 - T_1}{p_2 - p_1}\right)(p - p_1) = \frac{\Delta T}{\Delta p}(p - p_1)$$

Then $dT = (\Delta T/\Delta p)\,dp$ and (23) becomes

$$dV = R\left[\frac{\Delta T}{\Delta p}\frac{dp}{p} - \left(T_1 - \frac{\Delta T}{\Delta p}\,p_1\right)\frac{dp}{p^2} - \frac{\Delta T}{\Delta p}\frac{dp}{p}\right]$$

or, on integration,

$$V_2 - V_1 = \Delta V = \frac{R(T_2 p_1 - p_2 T_1)}{p_1 p_2}$$

The second path will be considered as consisting of two parts: AB and BC
(cf. Fig. 1).
Along path AB, $T = T_1$, dT = 0 and along BC, $p = p_2$, dp = 0, hence

$$dV = -RT_1\frac{dp}{p^2} + \frac{R}{p_2}\,dT$$

or

$$\Delta V = \frac{R(T_2 p_1 - p_2 T_1)}{p_1 p_2}$$

The change in volume is thus the same for these alternative paths.
A similar conclusion might have been drawn from the test for exactness:

$$M = R/p; \quad N = -RT/p^2$$

$$\frac{\partial M}{\partial p} = -\frac{R}{p^2} = \frac{\partial N}{\partial T}$$

which shows that (23) is exact.
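The mechanical work done by an expanding gas is

$$dW = p\,dV \tag{1-24}$$

regardless of the shape of the container and provided that the expansion is
performed reversibly² in the thermodynamic sense. Combining (24) with
(23) we obtain

$$dW = p\left(\frac{\partial V}{\partial T}\right)_p dT + p\left(\frac{\partial V}{\partial p}\right)_T dp = R\,dT - \frac{RT}{p}\,dp \tag{1-25}$$

It is clear that dW is inexact since

$$M = R; \quad N = -\frac{RT}{p}; \quad \frac{\partial M}{\partial p} = 0 \neq \frac{\partial N}{\partial T} = -\frac{R}{p}$$

By path AC,

$$dW = R\left[\frac{\Delta T}{\Delta p}\,dp - \left(T_1 - \frac{\Delta T}{\Delta p}\,p_1\right)\frac{dp}{p} - \frac{\Delta T}{\Delta p}\,dp\right]$$

and, on integration,

$$W_2 - W_1 = \Delta W_1 = R\left(\frac{\Delta T}{\Delta p}\,p_1 - T_1\right)\ln\frac{p_2}{p_1}$$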


2 Here and elsewhere in this chapter, we assume that all processes are performed 
reversibly when such requirement is needed for the argument. For discussions of 
reversibility, texts on thermodynamics should be consulted.
Along paths AB and BC,

$$dW = R\left[-T_1\frac{dp}{p} + dT\right]$$

or

$$\Delta W_2 = R\left[-T_1\ln\frac{p_2}{p_1} + \Delta T\right]$$

Comparison of $\Delta W_1$ and $\Delta W_2$ shows that the work is different along the
two paths.
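
The two line integrals can also be evaluated numerically. The following sketch integrates dV = (R/p) dT − (RT/p²) dp and dW = R dT − (RT/p) dp for one mole of ideal gas along the straight path AC and along the broken path AB followed by BC. The end states, the midpoint rule, and all function names are illustrative assumptions.

```python
import math

R = 8.314                   # J / (mol K)
P1, T1 = 1.0e5, 300.0       # state A (illustrative values)
P2, T2 = 2.0e5, 400.0       # state C (illustrative values)

def line_integral(coef_T, coef_p, path, n=20000):
    """Midpoint-rule estimate of the integral of coef_T dT + coef_p dp
    along path(s), where s runs from 0 to 1 and path returns (p, T)."""
    total = 0.0
    p_old, T_old = path(0.0)
    for i in range(1, n + 1):
        p_new, T_new = path(i / n)
        p_mid, T_mid = 0.5 * (p_old + p_new), 0.5 * (T_old + T_new)
        total += (coef_T(p_mid, T_mid) * (T_new - T_old)
                  + coef_p(p_mid, T_mid) * (p_new - p_old))
        p_old, T_old = p_new, T_new
    return total

def path_AC(s):
    """Straight line from A to C in the p-T plane."""
    return P1 + s * (P2 - P1), T1 + s * (T2 - T1)

def path_ABC(s):
    """A to B at constant T1, then B to C at constant p2."""
    if s <= 0.5:
        return P1 + 2 * s * (P2 - P1), T1
    return P2, T1 + (2 * s - 1) * (T2 - T1)

volume_form = (lambda p, T: R / p, lambda p, T: -R * T / p**2)   # dV
work_form = (lambda p, T: R, lambda p, T: -R * T / p)            # dW

dV_AC = line_integral(*volume_form, path_AC)
dV_ABC = line_integral(*volume_form, path_ABC)
dW_AC = line_integral(*work_form, path_AC)
dW_ABC = line_integral(*work_form, path_ABC)

print(dV_AC, dV_ABC)   # equal: dV is exact
print(dW_AC, dW_ABC)   # different: dW is inexact
```

The volume change agrees with R(T₂p₁ − p₂T₁)/(p₁p₂) on both paths, while the two values of the work differ, as the closed forms for ΔW₁ and ΔW₂ require.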

Heat absorbed or evolved in a process, dQ, also depends on the path.
The expression for the inexact differential with p and T as independent
variables is

$$dQ = \left(\frac{\partial Q}{\partial T}\right)_p dT + \left(\frac{\partial Q}{\partial p}\right)_T dp = C_p\,dT + \Lambda_p\,dp \tag{1-26}$$

where $C_p$ and $\Lambda_p$ are continuous functions of T and p, known as the heat
capacity at constant pressure and the latent heat of change of pressure,
respectively.

Problem. Connect the points $p_1$, $V_1$ and $p_2$, $V_2$ of Fig. 1 with a circular arc. Integrate
(23) along this path.

1.9. The Laws of Thermodynamics.—There are obvious advantages in 
expressing the laws of thermodynamics in terms of quantities which are 
independent of the path. As we have seen, both dQ and dW are inexact, 
but the difference between them, a function known as the internal energy 


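$$dU = dQ - dW \tag{1-27}$$

is an exact differential.³ This equation⁴ often serves as a statement of the
first law of thermodynamics. By combining (25) and (26) we may also
write

$$dU = \left[C_p - p\left(\frac{\partial V}{\partial T}\right)_p\right] dT + \left[\Lambda_p - p\left(\frac{\partial V}{\partial p}\right)_T\right] dp \tag{1-28}$$

with the additional requirement of exactness from (22)

$$\frac{\partial}{\partial p}\left[C_p - p\left(\frac{\partial V}{\partial T}\right)_p\right]_T = \frac{\partial}{\partial T}\left[\Lambda_p - p\left(\frac{\partial V}{\partial p}\right)_T\right]_p \tag{1-29}$$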


3 This fact was recognized by Clausius, “ The Mechanical Theory of Heat,” trans- 
lated by W. R. Browne, Macmillan & Co., London, 1879, who discusses the laws of 
thermodynamics from this standpoint. 

4 Note that +dQ means heat absorbed and +dW work done by the system. Minus 
signs indicate heat evolved or work done on the system.
These two equations are a more satisfactory definition of the first law than
(27) since they show the essential fact that the internal energy, dU, is an
exact differential. The inexactness of dQ and dW is sometimes indicated⁵
by stating the first law in the form of (27) with symbols such as đQ, DQ,
or δQ on the right.

The second law of thermodynamics is based upon an attempt to find a
function of dQ which is an exact differential. From (27) and (24),

$$dQ = dU + dW = dU + p\,dV \tag{1-27a}$$

but U = f(V,T), hence

$$dU = \left(\frac{\partial U}{\partial V}\right)_T dV + \left(\frac{\partial U}{\partial T}\right)_V dT$$

$$dQ = \left(\frac{\partial U}{\partial T}\right)_V dT + \left[p + \left(\frac{\partial U}{\partial V}\right)_T\right] dV \tag{1-30}$$

In passing from an initial state, $V_1$, $T_1$, to a final state, $V_2$, $T_2$, the integral
on the right of (30) cannot be evaluated without further information,
since the second term contains both p and V. In the special case of an
ideal gas where pV = RT and $(\partial U/\partial V)_T = 0$, (30) becomes

$$dQ = \left(\frac{\partial U}{\partial T}\right)_V dT + RT\,\frac{dV}{V} \tag{1-31}$$

The first term on the right of this expression is the heat capacity at constant
volume and depends on the temperature alone. If therefore we make the
further restriction of constant temperature, that is, assume the process to
be isothermal, the integral may be obtained. The form of (31) suggests that
if we divide by T, the resulting equation

$$\frac{dQ}{T} = \frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_V dT + R\,\frac{dV}{V}$$

may also be integrated when T changes. The more general inexact differential
(26) when divided by T is also exact, the quantity S so defined being
the entropy

$$dS = \frac{dQ}{T} = \frac{C_p}{T}\,dT + \frac{\Lambda_p}{T}\,dp \tag{1-32}$$

The condition for exactness

$$\left[\frac{\partial}{\partial p}\left(\frac{C_p}{T}\right)\right]_T = \left[\frac{\partial}{\partial T}\left(\frac{\Lambda_p}{T}\right)\right]_p \tag{1-33}$$


5 The question of a suitable notation for use in thermodynamics has been discussed
by Tunell, G., J. Phys. Chem. 36, 1744 (1932); J. Chem. Phys. 9, 191 (1941); see also,
Menger, K., Am. J. Phys. 18, 89 (1950).
together with (32) serve as basis for a statement of the second law. Our
arguments concerning the first and second laws are intended only to show 
their property of exactness. The most satisfactory formulation of these 
laws is probably that of Carathéodory. We consider this subject in 
sec. 1.15. 
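
The effect of dividing by T can be seen numerically for an ideal gas. The sketch below integrates dQ of (1-31) and dQ/T around a closed rectangular cycle in the T-V plane. The constant heat capacity (3/2)R of a monatomic gas, the corner states, and the function names are assumptions made only for illustration.

```python
import math

R = 8.314
CV = 1.5 * R    # assumption: constant heat capacity of a monatomic ideal gas

def cycle_integral(coef_T, coef_V, corners, n=5000):
    """Midpoint-rule integral of coef_T dT + coef_V dV around the closed
    polygon whose vertices (T, V) are listed in corners."""
    total = 0.0
    for k in range(len(corners)):
        Ta, Va = corners[k]
        Tb, Vb = corners[(k + 1) % len(corners)]
        for i in range(n):
            s = (i + 0.5) / n
            T, V = Ta + s * (Tb - Ta), Va + s * (Vb - Va)
            total += (coef_T(T, V) * (Tb - Ta) + coef_V(T, V) * (Vb - Va)) / n
    return total

# Rectangle in the T-V plane (illustrative states, V in cubic metres)
corners = [(300.0, 0.020), (400.0, 0.020), (400.0, 0.040), (300.0, 0.040)]

heat = cycle_integral(lambda T, V: CV, lambda T, V: R * T / V, corners)
entropy = cycle_integral(lambda T, V: CV / T, lambda T, V: R / V, corners)

print(heat)      # not zero: dQ is inexact
print(entropy)   # zero within rounding: dQ/T is exact
```

For this cycle the heat integral equals R(T₂ − T₁) ln(V₂/V₁), while the integral of dQ/T vanishes.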
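The functions dU and dS may be combined by using (24), (27) and
(32), to give

$$dU = T\,dS - p\,dV \tag{1-34}$$

Since

$$U = f(S,V) \tag{1-35}$$

and dU is exact, we may also write

$$dU = \left(\frac{\partial U}{\partial S}\right)_V dS + \left(\frac{\partial U}{\partial V}\right)_S dV \tag{1-36}$$

Comparison of (34) with (36) shows that

$$\left(\frac{\partial U}{\partial S}\right)_V = T; \quad \left(\frac{\partial U}{\partial V}\right)_S = -p$$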

The importance of (35) arises from the fact that if U is known as a function 
of two independent variables, S and V, it is possible to calculate numerical
values of p, T and U for any thermodynamic state when S and V are given.
A quantity like U thus furnishes more information than the equation of
state, for the latter will only give p, V and T; in order to obtain U and S,
the heat capacity as a function of temperature must also be given. It is not
necessary to choose S and V as the independent variables in (35) or (36), in
fact any pair of the set: p, V, T, S (or of the functions to be defined immediately)
may be taken, but the resulting exact differential is simpler when S
and V are selected.

When the conditions of a specific problem suggest another pair of inde- 
pendent variables, it is more convenient to define additional thermodynamic 
functions. These are given in the following relations, where the symbol as 
used by Gibbs precedes the one now customary.⁶

The heat content or enthalpy, χ = H = U + pV

$$dH = dU + p\,dV + V\,dp = T\,dS + V\,dp \tag{1-37}$$

The work content or Helmholtz free energy, ψ = A = U − TS

$$dA = dU - T\,dS - S\,dT = -S\,dT - p\,dV \tag{1-38}$$

6 Gibbs preferred S and V as independent variables for reasons given in loc. cit.,
footnote on page 34.

The free energy or Gibbs thermodynamic potential, ζ = F = U − TS + pV

$$dF = dU - T\,dS - S\,dT + p\,dV + V\,dp = -S\,dT + V\,dp \tag{1-39}$$


As in the case of dU, any pair of the set: p, V, T, S, U, H, A, F may
be chosen as independent variables, but the exact differential is simpler when
expressed in terms of the functions shown in the last equation of (37),
(38) or (39). Since most experimental work is done at constant pressure
rather than at constant volume, it is obvious that H and F (where the
pressure is one of the independent variables) are more generally useful
than U and A. The whole of the thermodynamics of systems of constant
composition may be developed, however, using any one of the following sets
of variables: (1) U, S, V; (2) H, S, p; (3) A, T, V; (4) F, T, p.

It is frequently necessary to have some means of predicting the direction 
in which a system spontaneously approaches a state of thermodynamic 
equilibrium. Let us consider two bodies, one at a temperature $T_1$ and the
other at a lower temperature $T_2$. Then if the whole system is surrounded
by adiabatic walls so that no heat enters it, we may write

$$dS_1 = -\frac{dQ}{T_1}; \quad dS_2 = \frac{dQ}{T_2}$$

where dQ is the heat absorbed by the colder body. The total entropy of the
system thus increases, for

$$dS = dS_1 + dS_2 = dQ\,\frac{(T_1 - T_2)}{T_1 T_2} > 0$$

Clearly dS = 0 when thermal equilibrium is reached. From (39), we also
see that at constant temperature and pressure, dF = 0 when equilibrium
is established. Since the entropy reaches a maximum, the free energy
simultaneously reaches a minimum. In Table 1, we collect the criteria


TABLE 1

Independent Variables Fixed        Dependent Variable

DEPENDENT VARIABLE BECOMES A MINIMUM
S, V                               U
S, p                               H
T, V                               A
T, p                               F

DEPENDENT VARIABLE BECOMES A MAXIMUM
U, V or H, p                       S
F, T or H, S                       p




for spontaneous approach to equilibrium when various pairs of the independent
variables are held constant.


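Problem a. Find expressions for S, H, V, A, U in terms of set (4).

Ans.

$$S = -\frac{\partial F}{\partial T}; \quad H = F - T\frac{\partial F}{\partial T}; \quad V = \frac{\partial F}{\partial p}; \quad A = F - p\frac{\partial F}{\partial p}; \quad U = F - T\frac{\partial F}{\partial T} - p\frac{\partial F}{\partial p}$$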


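Problem b. Derive the following equations, which are known as Maxwell's relations:

$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V; \quad \left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p; \quad \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V; \quad \left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p$$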
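
One possible route to Maxwell's relations, offered as a sketch and not as the intended solution, is to apply the exactness condition (1-22) to each of the four exact differentials (1-34), (1-37), (1-38) and (1-39) in turn. The coefficients of each differential play the parts of M and N:

```latex
\begin{aligned}
dU &= T\,dS - p\,dV  &&\Longrightarrow\quad \left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V \\
dH &= T\,dS + V\,dp  &&\Longrightarrow\quad \left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p \\
dA &= -S\,dT - p\,dV &&\Longrightarrow\quad \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V \\
dF &= -S\,dT + V\,dp &&\Longrightarrow\quad \left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p
\end{aligned}
```

In the first line, for example, M = T and N = −p with S and V as the independent variables, so that ∂M/∂V = ∂N/∂S gives the relation shown.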

1.10. Systematic Derivation of Partial Thermodynamic Derivatives.— 
With the addition of Q and W, we have ten important thermodynamic 
quantities. The heat capacities are not included in the list, since by their 
definitions: $C_p = (\partial Q/\partial T)_p$, $C_V = (\partial Q/\partial T)_V$, they may be readily determined
from the set of ten just mentioned. We now wish to describe methods
of obtaining all first order partial derivatives of the form $(\partial x/\partial y)_z$ where
x, y and z are any members of the set. It is immediately apparent that
there are a large number of them for there are ten ways of choosing x,
leaving nine and eight ways, respectively, of choosing y and z, a total of
720 first derivatives. When all possible relations between the first derivatives
are included, the total number of equations is increased enormously
for, in general, a selected derivative may be written in terms of three other
derivatives which are independent of each other as the following considerations
show. Suppose x = f(y,w), then

$$dx = \left(\frac{\partial x}{\partial y}\right)_w dy + \left(\frac{\partial x}{\partial w}\right)_y dw$$

$$\left(\frac{\partial x}{\partial y}\right)_z = \left(\frac{\partial x}{\partial y}\right)_w + \left(\frac{\partial x}{\partial w}\right)_y \left(\frac{\partial w}{\partial y}\right)_z$$


There are, of course, many cases where there are relations between fewer
than four derivatives but neglecting these, the total number of equations
obtainable is the number of combinations of 720 derivatives taken four at
a time, 720!/(4! 716!) or approximately $10^{10}$. Although many of the relations
are of little use, it is convenient to devise a systematic method for
obtaining any of them.
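
The two counts quoted above are easily reproduced. The short sketch below uses Python's standard `math.perm` and `math.comb` (available from Python 3.8); the list of symbols is simply the set of ten quantities named in the text.

```python
import math

# The ten thermodynamic quantities considered in the text
quantities = ["p", "V", "T", "S", "U", "H", "A", "F", "Q", "W"]

# Ordered choices of three distinct quantities x, y, z: 10 * 9 * 8
n_derivatives = math.perm(len(quantities), 3)

# Unordered selections of four derivatives from that list: 720! / (4! 716!)
n_relations = math.comb(n_derivatives, 4)

print(n_derivatives)   # 720
print(n_relations)     # 11104365420
```

The second number is about 1.1 × 10¹⁰.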
The best known of these methods is that of Bridgman⁷ which is simple



7 Bridgman, P. W., “Condensed Collection of Thermodynamic Formulas,” 
Harvard University Press, Cambridge, Mass., 1926. 
and often used. It will be described only briefly since it is a special case
of a more general procedure which we give in sec. 1.13. It is unnecessary
to compute the $10^{10}$ relations because any one of them could be obtained if
the 720 first derivatives were tabulated in terms of the same set of three
independent derivatives. The particular choice of the three is arbitrary,
Bridgman having taken

$$\left(\frac{\partial V}{\partial T}\right)_p, \quad \left(\frac{\partial V}{\partial p}\right)_T, \quad \left(\frac{\partial Q}{\partial T}\right)_p$$

because these are directly obtainable by experiment. One could then pick 
any four derivatives, write them in terms of the chosen three and eliminate 


the three derivatives from the four equations. The result would be a single 


equation containing the four derivatives. 
The 720 derivatives could then be classified into ten groups by holding
one quantity constant and varying the other nine. Within the group
containing derivatives at constant z,

$$\left(\frac{\partial x}{\partial y}\right)_z = \frac{(\partial x/\partial w)_z}{(\partial y/\partial w)_z} \tag{1-40}$$

which follows by writing according to (11)

$$dx = \left(\frac{\partial x}{\partial w}\right)_z dw + \left(\frac{\partial x}{\partial z}\right)_w dz$$

$$dy = \left(\frac{\partial y}{\partial w}\right)_z dw + \left(\frac{\partial y}{\partial z}\right)_w dz$$

setting dz = 0 and dividing one equation by the other. It should be
remembered that even if x and y are not functions of w and z it is still
possible to have inexact differentials of the form of (41), hence the present
arguments apply to dQ and dW as well as to the remaining eight thermodynamic
functions. Upon adopting the abbreviations

$$\left(\frac{\partial x}{\partial w}\right)_z = (\partial x)_z; \quad \left(\frac{\partial y}{\partial w}\right)_z = (\partial y)_z \tag{1-41}$$

any derivative at constant z may be written in purely formal fashion by
taking the ratio of the proper pair, or

$$\left(\frac{\partial x}{\partial y}\right)_z = \frac{(\partial x)_z}{(\partial y)_z}$$
The task of computing the 72 derivatives in this group is thus reduced to
calculation of the nine quantities $(\partial x)_z$, $(\partial y)_z$, $\ldots$. The latter are easily
found when several of the derivatives $(\partial x/\partial y)_z$ are known in terms of the
fundamental three for it proves possible to split the former into numerator
and denominator by inspection.

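If each of the remaining groups were treated in a similar way, 90 expressions
of the form $(\partial x)_z$, $(\partial y)_z$, $(\partial x)_y$, $\ldots$ would be obtained but in every
case $(\partial x)_y = -(\partial y)_x$ so that the final list need contain only 45 relations;
they are given by Bridgman (loc. cit.) in convenient tables.⁸ The following
examples show their use. Let it be required to calculate $(\partial T/\partial p)_H$.
From the tables, $(\partial T)_H = V - T(\partial V/\partial T)_p$, $(\partial p)_H = -C_p$, thus

$$\left(\frac{\partial T}{\partial p}\right)_H = \frac{1}{C_p}\left[-V + T\left(\frac{\partial V}{\partial T}\right)_p\right]$$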


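Many alternative forms are easily found, for example,

$$\left(\frac{\partial T}{\partial S}\right)_p = \frac{T}{C_p}; \quad \left(\frac{\partial T}{\partial p}\right)_S = \frac{T}{C_p}\left(\frac{\partial V}{\partial T}\right)_p; \quad \left(\frac{\partial S}{\partial p}\right)_H = -\frac{V}{T}$$

hence,

$$\left(\frac{\partial T}{\partial p}\right)_H = \left(\frac{\partial S}{\partial p}\right)_H\left(\frac{\partial T}{\partial S}\right)_p + \left(\frac{\partial T}{\partial p}\right)_S$$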

Additional examples, tables for a few of the second derivatives, and exten- 
sion of the method to include mechanical variables other than pressure 
have also been given by Bridgman. 

A further amplification of the method has been presented by Goranson,⁹
whose tables include the following cases: (1) one-component unit mass
systems (constant total mass); (2) one-component variable mass systems
or two-component unit mass systems; (3) two-component variable mass
systems or three-component unit mass systems; (4) three-component variable
mass systems or four-component unit mass systems. Simplified methods
for constructing such tables have been proposed by several authors.¹⁰

8 For abbreviated tables, see, for example, Slater, “Introduction to Chemical
Physics,” McGraw-Hill Book Co., New York, 1939; or Glasstone, loc. cit.

9 Goranson, Roy W., “Thermodynamic Relations in Multi-component Systems,”
Carnegie Institution of Washington, Washington, D. C., 1930.

10 Lerman, F., J. Chem. Phys. 6, 792 (1937); Tobolsky, A., ibid., 10, 644 (1942);
Bent, H. A., ibid., 21, 1408 (1953); Carroll, B. and Lehrman, A., J. Chem. Ed. 24, 389
(1947).

1.11. Thermodynamic Derivatives by Method of Jacobians.—A more
general method which is based on the properties of functional determinants
or Jacobians has been described by Shaw.¹¹ The mathematical basis on
which it is founded will be discussed in detail in order to explain the
construction of the required table and its application to specific examples.

1.12. Properties of the Jacobian.—The Jacobian¹² of x and y with
respect to two independent variables, u and v, is defined by

$$J(x,y/u,v) = \frac{\partial(x,y)}{\partial(u,v)} = \begin{vmatrix} (\partial x/\partial u)_v & (\partial x/\partial v)_u \\ (\partial y/\partial u)_v & (\partial y/\partial v)_u \end{vmatrix} \tag{1-42}$$

When the independent variables are discernible from the context, the
Jacobian may be abbreviated as J(x,y), the second form of (42) being
reserved for cases where it is necessary to give the independent variables
explicitly. The following properties are obtained directly from the definition
of the Jacobian:

$$\begin{aligned} J(u,v) &= -J(v,u) = 1; \\ J(x,x) &= 0; \quad J(k,x) = 0; \quad k, \text{ any constant} \\ J(x,y) &= J(y,-x) = J(-y,x) = -J(y,x) \end{aligned} \tag{1-43}$$

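11 Shaw, A. N., Phil. Trans. Roy. Soc. (London) A234, 299-328 (1935).
12 The properties of determinants, which are used here, are discussed in Chapter 1

A further important property of the Jacobian arises if x and y are explicit
functions of z and w, which in turn are explicit functions of u and v. Writing
$\partial(x,y)/\partial(z,w)$ and $\partial(z,w)/\partial(u,v)$ in determinant form, using the rule for
the multiplication of determinants, the abbreviations $(\partial x/\partial z)_w = x_z$, and
so on, we have

$$\begin{vmatrix} x_z & x_w \\ y_z & y_w \end{vmatrix} \times \begin{vmatrix} z_u & z_v \\ w_u & w_v \end{vmatrix} = \begin{vmatrix} x_z z_u + x_w w_u & x_z z_v + x_w w_v \\ y_z z_u + y_w w_u & y_z z_v + y_w w_v \end{vmatrix}$$

A typical element of the product is

$$x_z z_u + x_w w_u = \left(\frac{\partial x}{\partial z}\right)_w\left(\frac{\partial z}{\partial u}\right)_v + \left(\frac{\partial x}{\partial w}\right)_z\left(\frac{\partial w}{\partial u}\right)_v = \left(\frac{\partial x}{\partial u}\right)_v$$

the last form resulting from (8), hence

$$\frac{\partial(x,y)}{\partial(z,w)} \times \frac{\partial(z,w)}{\partial(u,v)} = \begin{vmatrix} x_u & x_v \\ y_u & y_v \end{vmatrix} = \frac{\partial(x,y)}{\partial(u,v)} \tag{1-44}$$

In the important special case, y = v,

$$\frac{\partial(x,y)}{\partial(u,y)} = \begin{vmatrix} x_u & x_y \\ y_u & y_y \end{vmatrix} = x_u = \left(\frac{\partial x}{\partial u}\right)_y \tag{1-45}$$

for $y_u = 0$ and $y_y = 1$.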
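
The multiplication rule (44) can be checked numerically. The sketch below is ours, not part of the original treatment: it estimates each Jacobian by central differences for some arbitrarily chosen (hypothetical) functions and compares the product of the two Jacobians with the Jacobian of the composite functions.

```python
import math

def jacobian(f, g, u, v, h=1e-5):
    """Central-difference estimate of d(f,g)/d(u,v) at the point (u, v)."""
    f_u = (f(u + h, v) - f(u - h, v)) / (2 * h)
    f_v = (f(u, v + h) - f(u, v - h)) / (2 * h)
    g_u = (g(u + h, v) - g(u - h, v)) / (2 * h)
    g_v = (g(u, v + h) - g(u, v - h)) / (2 * h)
    return f_u * g_v - f_v * g_u

# Hypothetical test functions: z, w depend on (u, v); x, y depend on (z, w).
def z_of(u, v):
    return u * v + math.sin(u)

def w_of(u, v):
    return u - v * v

def x_of(z, w):
    return z * z + w

def y_of(z, w):
    return math.exp(0.3 * z) * w

def x_uv(u, v):
    return x_of(z_of(u, v), w_of(u, v))

def y_uv(u, v):
    return y_of(z_of(u, v), w_of(u, v))

u0, v0 = 0.7, 1.3
z0, w0 = z_of(u0, v0), w_of(u0, v0)

lhs = jacobian(x_of, y_of, z0, w0) * jacobian(z_of, w_of, u0, v0)
rhs = jacobian(x_uv, y_uv, u0, v0)
print(lhs, rhs)   # the two numbers agree to within the finite-difference error
```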


Since many thermodynamic functions are of the form f(x,y,z) = 0, where
any one variable is determined by the other two, we may write from (4),

$$dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy$$

or using (45)

$$dz = \frac{\partial(z,y)}{\partial(x,y)}\,dx + \frac{\partial(z,x)}{\partial(y,x)}\,dy$$

Expressing each of these variables in terms of two new independent variables,
r and s, and using the abbreviations $J(x,y) = \partial(x,y)/\partial(r,s)$, etc.,
(44) enables us to write

$$dz = \frac{J(z,y)}{J(x,y)}\,dx + \frac{J(z,x)}{J(y,x)}\,dy$$

If we multiply by J(x,y),

$$J(z,y)\,dx + J(x,z)\,dy + J(y,x)\,dz = 0 \tag{1-46}$$

since J(x,y) = −J(y,x), etc., from (43). If two more variables, u and v,
are related to r and s in the same way, (46) may be divided by du at constant
v, giving

$$J(z,y)\left(\frac{\partial x}{\partial u}\right)_v + J(x,z)\left(\frac{\partial y}{\partial u}\right)_v + J(y,x)\left(\frac{\partial z}{\partial u}\right)_v = 0$$

so that finally, again because of (45),

$$J(z,y)J(x,v) + J(x,z)J(y,v) + J(y,x)J(z,v) = 0 \tag{1-47}$$

Problem. If r, s are functions of x, y, z and the latter in turn are functions of the
independent variables u, v show that

$$J(r,s/u,v) = J(r,s/x,y)\,J(x,y/u,v) + J(r,s/y,z)\,J(y,z/u,v) + J(r,s/z,x)\,J(z,x/u,v).$$

1.13. Application to Thermodynamics.—This last equation is the
important one which determines all of the thermodynamic partial derivatives,
for if two independent variables, r and s, are chosen which completely
determine the others, x, y, z, v, then any one Jacobian, for example
J(x,y), is given in terms of five others. But if r and s are taken from the
set x, y, z, v, then J(x,y) is given in terms of only four others, since by
(47) $J(r,s) = \partial(r,s)/\partial(r,s) = 1$.

Let us choose p, V, T and S for x, y, z and v, respectively, so that

$$J(T,V)J(p,S) + J(p,T)J(V,S) + J(V,p)J(T,S) = 0 \tag{1-48}$$

One more reduction is possible since from (34),

$$(\partial U/\partial V)_S = -p; \quad (\partial U/\partial S)_V = T$$

and

$$(\partial^2 U/\partial S\,\partial V) = (\partial T/\partial V)_S = -(\partial p/\partial S)_V$$

In Jacobian notation,

$$J(T,S)/J(V,S) = -J(p,V)/J(S,V)$$

Finally since J(V,S) = −J(S,V) from (43), we obtain

$$J(T,S) = J(p,V)$$

When the following abbreviations

$$\begin{aligned} a &= J(V,T) \\ b &= J(p,V) = J(T,S) \\ c &= J(p,S) \\ l &= J(p,T) \\ n &= J(V,S) \end{aligned} \tag{1-49}$$

are substituted into (48) and (43) is used to change the signs, we have

$$b^2 + ac - nl = 0 \tag{1-50}$$
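
As an illustration of (49) and (50), the five Jacobians can be evaluated for a simple substance. The sketch below is our own and rests on stated assumptions: one mole of ideal gas, with V and T as the independent variables, $p = RT/V$ and $S = C_V \ln T + R \ln V + \text{const.}$; the numerical values are arbitrary.

```python
# One mole of ideal gas with (V, T) as independent variables (assumed model):
#   p = R*T/V,   S = Cv*ln(T) + R*ln(V) + const.
R, Cv = 8.314, 12.47      # J/(mol K); Cv = 3R/2 is an assumed monatomic value
V, T = 0.024, 300.0       # arbitrary state (m^3, K)

# Each quantity is represented by its pair of partials (d/dV, d/dT).
p_part = (-R * T / V**2, R / V)
S_part = (R / V, Cv / T)
V_part = (1.0, 0.0)
T_part = (0.0, 1.0)

def J(x, y):
    """Jacobian d(x,y)/d(V,T) built from the partial-derivative pairs."""
    return x[0] * y[1] - x[1] * y[0]

a = J(V_part, T_part)
b = J(p_part, V_part)
b_alt = J(T_part, S_part)     # second form of b in (1-49)
c = J(p_part, S_part)
l = J(p_part, T_part)
n = J(V_part, S_part)

residual = b * b + a * c - n * l
print(b, b_alt)    # the two forms of b agree
print(residual)    # zero up to rounding, as (1-50) requires
```

With V and T independent, a = 1, in line with the remark made later in the text that each of the letters can be reduced to unity by a proper choice of variables.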


It is convenient to list the various Jacobians in rows and columns,
J(x,y) occurring at the intersection of row x with column y. The upper
left-hand block of such a table is immediately filled by using the definitions
(49), the rule for the change of signs, and the fact that J(x,x) = 0 from
(43). The entries for the lower left-hand corner of the table are obtained
by writing the definitions of dU, dH, etc., in Jacobian form. For example,
since

$$dU = T\,dS - p\,dV$$

$$J(U,z) = T\,J(S,z) - p\,J(V,z)$$

where z is any required variable. Hence, if z is taken as p and then as V,

$$J(U,p) = T\,J(S,p) - p\,J(V,p) = -Tc + pb$$

$$J(U,V) = T\,J(S,V) - p\,J(V,V) = -Tn$$

the last forms following from the part of the table which is already filled
or from the definitions in (49). The upper right-hand corner may be filled
at the same time, without further calculation, by changing all signs. The
table is completed by using relations already found, as for example

$$\begin{aligned} J(A,H) = -J(H,A) &= -S\,J(T,H) - p\,J(V,H) \\ &= -S(Tb - Vl) - p(Tn - Vb) \\ &= -T(Sb + pn) + V(Sl + pb) \end{aligned}$$

The final result is shown in Table 2. The use of it is typified by the 
following examples. 

Example 1. Evaluate $(\partial F/\partial T)_V$ in terms of other partial derivatives with
T and V as independent variables. In Jacobian notation and from Table 2

$$(\partial F/\partial T)_V = J(F,V)/J(T,V) = -\frac{Sa + Vb}{a} = -S - Vb/a$$

But

$$b/a = J(p,V)/J(V,T) = -J(p,V)/J(T,V) = -(\partial p/\partial T)_V$$

hence,

$$(\partial F/\partial T)_V = -S + V(\partial p/\partial T)_V$$


Example 2. Transform the result of the preceding example into derivatives
with p and S as independent variables. If the previous result is used,
the term a causes trouble, since with p and S as independent variables, we
obtain $a = J(V,T) = \partial(V,T)/\partial(p,S)$, a relation which cannot be reduced
to a single derivative. In general, as we have shown, any partial derivative
may be expressed in terms of not more than three other derivatives of
thermodynamic functions. We therefore use (50), which gives
$a = (nl - b^2)/c$, or,

$$(\partial F/\partial T)_V = -S - Vbc/(nl - b^2)$$

But

$$\begin{aligned} b &= J(p,V) = \partial(p,V)/\partial(S,p) = -(\partial V/\partial S)_p \\ c &= J(p,S) = \partial(p,S)/\partial(S,p) = -1 \\ l &= J(p,T) = \partial(p,T)/\partial(S,p) = -(\partial T/\partial S)_p \\ n &= J(V,S) = \partial(V,S)/\partial(S,p) = -(\partial V/\partial p)_S \end{aligned}$$

hence,

$$(\partial F/\partial T)_V = -S - V\left[\frac{(\partial V/\partial S)_p}{(\partial V/\partial p)_S\,(\partial T/\partial S)_p - (\partial V/\partial S)_p^2}\right]$$

This procedure may be repeated using other quantities, such as T and 
S, V and p, and so on, as independent variables. The difficulty in choosing 
the proper form of the original relation may usually be removed in the 
following way. Referring to the definitions of a, b, c, l and n, it is seen
that each can be reduced to unity by a proper choice of the independent
variables. For example, if the latter are chosen as V and T, a = 1, since
a = J(V,T). In the previous case, c = −1, and it was found advisable
to use some quantity other than a. The situation may be summed up in
the following directions. In case one of the letters in the top line of the set


$$\begin{bmatrix} a & c & l & n \\ c & a & n & l \end{bmatrix}$$

equals unity, do not use the one directly beneath it but transform
to another by means of (50). In this way, the resulting expression
will usually contain only three different partial derivatives. The omission 
of b from the above list arises from the fact that even if b = 1, only single 
derivatives will occur. 

Example 3. Solve for $(\partial p/\partial T)_V$ in terms of $C_V$, $C_p$ and $\mu = (\partial T/\partial p)_H$,
the Joule-Thomson coefficient. Problems of this sort frequently arise where
it is desired to express a partial thermodynamic derivative in terms of other
quantities, which are measured directly. The usual process of obtaining
the relationship is tedious and complex. From the table, it is found that

$$\begin{aligned} C_V &= (\partial Q/\partial T)_V = Tn/a \\ C_p &= (\partial Q/\partial T)_p = Tc/l \\ \mu &= (\partial T/\partial p)_H = (Tb - Vl)/Tc \\ (\partial p/\partial T)_V &= -b/a \end{aligned}$$

Since there are three relations given and only two letters in the last derivative,
it is convenient to write this in the form

$$(\partial p/\partial T)_V = -b^2/ab$$

and to solve for a, b and $b^2$ in terms of $C_V$, $C_p$ and $\mu$. Using (50) to obtain
a relation between $C_p$ and $b^2$, we have

$$C_p = T(nl - b^2)/al$$

$$a = Tn/C_V; \quad b^2 = la(C_V - C_p)/T; \quad b = l(\mu C_p + V)/T$$

and finally

$$(\partial p/\partial T)_V = (C_p - C_V)/(C_p\mu + V)$$


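Example 4. Determine $(\partial U/\partial V)_T$ for a gas obeying (i) the ideal gas
law, $pV = RT$; (ii) van der Waals' equation, $(p + \alpha/V^2)(V - \beta) = RT$.
In problems of this sort, the resulting formulas usually contain no more
than one partial derivative instead of three as in the earlier cases. From
Table 2,

$$(\partial U/\partial V)_T = J(U,T)/J(V,T) = -(Tb + pa)/a$$

If p and V are taken as independent variables,

$$b = 1; \quad a = J(V,T) = \frac{\partial(V,T)}{\partial(p,V)} = -\left(\frac{\partial T}{\partial p}\right)_V$$

$$\text{(i)}\quad \left(\frac{\partial T}{\partial p}\right)_V = \frac{V}{R}; \quad \left(\frac{\partial U}{\partial V}\right)_T = \frac{RT}{V} - p = 0$$

$$\text{(ii)}\quad \left(\frac{\partial T}{\partial p}\right)_V = \frac{V - \beta}{R}; \quad \left(\frac{\partial U}{\partial V}\right)_T = \frac{RT}{V - \beta} - p = \frac{\alpha}{V^2}$$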

In Shaw’s paper (loc. cit.), auxiliary tables are given to simplify the calcu- 
lations for the following cases: the ideal and van der Waals’ gas, the 
saturated vapor, black-body radiation. 

The Jacobian method has been extended by Shaw to include second 
derivatives and to apply to systems of variable composition. For these 
applications, as well as more detail on the use of the tables, the original 
paper should be consulted.¹³


Problem. Prove the following relations: 


TORTOR 


© r= e r GA) Ge), 


13 The Jacobian method has been described and illustrated with numerous examples
by Sherwood, T. K., and Reed, C. E., “Applied Mathematics in Chemical Engineering,”
McGraw-Hill Book Co., New York, 1939; see also, Crawford, F. H., Am. J. Phys. 17,
1 (1949).

1.14. Thermodynamic Systems of Variable Mass.—The development
of thermodynamics up to the time of Gibbs may be briefly summarized by
the equation of Clausius (34) which combined the two laws. The subject
was thus confined to systems of constant total mass. Gibbs showed how
this equation could be extended to include systems of variable mass.¹⁴ If
we consider a system composed of several substances whose masses are
$m_1, m_2, \ldots$ we may change the internal energy not only by varying the
entropy and the volume but also by varying the relative masses. Thus in
place of (35) we have


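$$U = U(S, V, m_1, m_2, \ldots, m_n)$$

and in place of (36)

$$dU = \left(\frac{\partial U}{\partial S}\right)_{V,m_1,m_2,\ldots} dS + \left(\frac{\partial U}{\partial V}\right)_{S,m_1,m_2,\ldots} dV + \left(\frac{\partial U}{\partial m_1}\right)_{S,V,m_2,\ldots} dm_1 + \left(\frac{\partial U}{\partial m_2}\right)_{S,V,m_1,\ldots} dm_2 + \cdots \tag{1-51}$$

If we write

$$\left(\frac{\partial U}{\partial m_i}\right)_{S,V,m_1,m_2,\ldots} = \mu_i \tag{1-52}$$

we have

$$dU = T\,dS - p\,dV + \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots \tag{1-53}$$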


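If dU is eliminated from (53) by using in turn equations (37), (38) and
(39) we obtain

$$\mu_i = \left(\frac{\partial H}{\partial m_i}\right)_{S,p,m_1,m_2,\ldots} = \left(\frac{\partial A}{\partial m_i}\right)_{V,T,m_1,m_2,\ldots} = \left(\frac{\partial F}{\partial m_i}\right)_{p,T,m_1,m_2,\ldots} \tag{1-54}$$

The partial derivatives defined by any of these equivalent expressions were
called by Gibbs the chemical potentials. We may also convert (53) into
the equation

$$dF = -S\,dT + V\,dp + \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots \tag{1-55}$$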


At constant temperature and pressure and for a reversible process, as we 
have shown, dF = 0; hence according to (55) the condition for equilibrium 
reads 


$$dF = \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots = 0 \tag{1-56}$$


From this equation we may derive the celebrated phase rule of Gibbs.
Let us understand by phase a homogeneous part of a system separated from
the rest of the system by recognizable boundaries. Thus a mixture of ice,
liquid water, and steam is a system of three phases. The number of
components is the least number of independently variable constituents
required to express the composition of each phase. In our previous example
there is only one component. In a system composed of an aqueous
solution of sugar there are two components for it is necessary to specify
the amounts of both water and sugar present. Finally we need a definition
of degree of freedom. It is the number of variables (such as temperature,
pressure, composition of the components) which is required to describe
completely the system at equilibrium. For example, liquid water in the
presence of water vapor is a system of one degree of freedom, for we may
vary either the temperature or the pressure but we cannot change both
simultaneously for then either the liquid or the vapor disappears.

14 His results also included other variables such as electric, magnetic, and gravitational
fields as well as surface phenomena.

Suppose a system contains C components and P phases, then an equa- 
tion of the form of (55) will hold for each phase. Since F like S and V is an 
extensive variable, it follows from (55) that the chemical potentials must be 
independent of the masses, so that we may integrate (56) term by term 
obtaining 


$$F = \mu_1 m_1 + \mu_2 m_2 + \cdots + \mu_C m_C \tag{1-57}$$

Differentiation of this equation results in

$$dF = \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots + \mu_C\,dm_C + m_1\,d\mu_1 + m_2\,d\mu_2 + \cdots + m_C\,d\mu_C$$

When it is subtracted from (56) we get

$$m_1\,d\mu_1 + m_2\,d\mu_2 + \cdots + m_C\,d\mu_C = 0 \tag{1-58}$$

Equilibrium can be established only when an equation of this form holds
for each of the P phases. But there are C + 2 variables $T, p, \mu_1, \mu_2, \ldots, \mu_C$,
hence the number of degrees of freedom f is

$$f = C + 2 - P \tag{1-59}$$

This simple equation has been of inestimable value in the study and interpretation
of heterogeneous equilibrium by the chemist, physicist and
metallurgist.¹⁵
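
The phase rule (59) is readily applied to the systems mentioned in the text. The short Python sketch below is ours; the function name is hypothetical and the refusal of negative results is our own safeguard, not part of the rule as stated.

```python
def degrees_of_freedom(components: int, phases: int) -> int:
    """Gibbs phase rule (1-59): f = C + 2 - P."""
    f = components + 2 - phases
    if f < 0:
        # More coexisting phases than the rule permits at equilibrium.
        raise ValueError("phase rule gives a negative number of degrees of freedom")
    return f

print(degrees_of_freedom(1, 2))  # liquid water with its vapor      -> 1
print(degrees_of_freedom(1, 3))  # ice, liquid water and steam      -> 0
print(degrees_of_freedom(2, 1))  # aqueous sugar solution, no vapor -> 3
```

The first case reproduces the single degree of freedom found above for liquid water in the presence of its vapor.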

15 Such applications, where graphical methods are normally used, are discussed by
Ricci, J. E., “The Phase Rule and Heterogeneous Equilibrium,” D. Van Nostrand Co.,
Inc., New York, 1951. Some mathematical methods for treating multicomponent
systems have been given by Dahl, L. A., J. Phys. & Colloid Chem. 52, 698 (1948);
54, 547 (1950).

1.15. The Principle of Carathéodory.—In most textbooks of thermodynamics,
the order of presentation parallels the historical development
of the subject. For this reason, considerable attention is paid to several
kinds of ideal or imaginary machines. The customary procedure is to
cite, first of all, the impossibility of constructing perpetual motion machines
of various types; when this is granted it is possible to state the conditions
under which real machines may operate and to derive the whole body of
positive assertions which are incorporated into the science of thermodynamics.
The critical student may feel the need of a more logical and
formal approach, and this will now be given.

We have attempted to emphasize in sec. 1.9 one important mathematical
consequence of the laws of thermodynamics, namely, that functions
such as dU and dS are exact differentials. We now wish to discuss a
more fundamental mathematical property of these laws which was discovered
by Carathéodory. His arguments¹⁶ are derived from the geometric
behavior of a certain differential equation and its solution. As a result, he
is able to obtain in a purely formal way the laws of thermodynamics without
recourse to fictitious machines or such objectionable concepts as the
flow of heat. We cannot reproduce here the complete theory¹⁷ but shall
only give the mathematical details of his treatment of the second law.

Let us assume that a thermodynamic system is composed of n separate
parts, each one of which is characterized by its pressure and volume. Further,
suppose that the whole system is surrounded by adiabatic walls or
thermal insulators while the individual parts of the system are separated
from each other by walls that are perfect conductors of heat. As a result
of experiment, it is found that there is no observable change in the system
(i.e., equilibrium has been reached) when the following conditions are met:

$$f_1(p_1,V_1) = f_2(p_2,V_2) = \cdots = f_n(p_n,V_n) = F(\vartheta) \tag{1-60}$$

The relation $f_i(p_i,V_i) = F(\vartheta)$ for the i-th part of the system is, of course,
an equation of state, and $\vartheta$ is the temperature of the whole system on some
suitable empirical scale. According to the first law (see eq. 27a)

$$dQ = dU + p\,dV = 0 \tag{1-61}$$

the whole system being adiabatic. Moreover, a similar equation holds for
each part of the system:

$$dQ_i = dU_i + p_i\,dV_i \tag{1-62}$$

and

$$dU = \sum_{i=1}^{n} dU_i; \quad dQ = \sum_{i=1}^{n} dQ_i \tag{1-63}$$


As we have shown, $dQ_i$ is not an exact differential. However, it depends
on only two variables, and under these conditions an infinite number
of integrating denominators exist.¹⁸ Hence eq. (62) may be converted into
an exact differential. Let an integrating denominator be $t_i$, so that

$$d\phi_i = dQ_i/t_i \tag{1-64}$$

is exact. Clearly $\phi_i$ is then a function of the state of the system, hence we
may change (61) in such a manner that the independent variables are $\vartheta$
and $\phi_i$ instead of U and V. The result of this transformation is

$$dQ = \sum_{i=1}^{n}\left[\left(\frac{\partial U_i}{\partial \phi_i} + p_i\frac{\partial V_i}{\partial \phi_i}\right)d\phi_i + \left(\frac{\partial U_i}{\partial \vartheta} + p_i\frac{\partial V_i}{\partial \vartheta}\right)d\vartheta\right] = 0 \tag{1-65}$$

16 Carathéodory, C., Math. Ann. 67, 355 (1909).

17 Carathéodory’s theory has been reviewed by Born, M., Physik. Z. 22, 218, 249,
282 (1922) and by Landé, A., “Handbuch der Physik,” Vol. IX, Chapter 4, J. Springer,
Berlin, 1926. See also, Buchdahl, H. A., Am. J. Phys. 17, 41, 44, 212 (1949).


The quantity dQ is not exact, nor is it to be taken for granted that it can
be made exact by the use of an integrating denominator if dQ contains
more than two variables. As a matter of fact, the procedure is possible
only when the differential equation dQ = 0 (known as a Pfaff equation)
possesses a solution, as we shall show in sec. 2.18. In that case (and we
shall here be interested in no other), there is an integrating denominator t
such that

$$d\phi = dQ/t \tag{1-66}$$

is exact, even when there are n variables. More important for our present
needs is the conclusion drawn from simple geometric considerations that if
there is an integrating denominator, then there are in the neighborhood of
any point P many other points which are not accessible from P along the
path dQ = 0. This formal mathematical consequence of the properties
of the Pfaff equation is known as the principle of Carathéodory. It is
exactly what we need for thermodynamics. Consider, for example, a gas
at a given pressure, $p_1$, and volume, $V_1$. We may expand or compress this
gas adiabatically (i.e., along the path dQ = 0), but the final state of the
system will be characterized by variables $p_2$, $V_2$ which we cannot choose at
will. There are many values of p and V which we are not able to realize
adiabatically.

We refer the reader again to sec. 2.18 for the conditions under which
equations like (65) have a solution, hence an integrating denominator.
We proceed here with the physical results which may be obtained when we
know that the integrating denominator exists. In order to simplify the
situation let us assume that the thermodynamic system is composed of
only two parts. This restriction does not mean that there is any loss in
generality of the final results since all our arguments could easily be
extended to cover a system of any number of parts. With n = 2, it follows
from (63), (64) and (66) that

$$t\,d\phi = t_1\,d\phi_1 + t_2\,d\phi_2 \tag{1-67}$$

If we take as in (65), $\phi_1$, $\phi_2$ and $\vartheta$ as independent variables we see that

$$\frac{\partial\phi}{\partial\phi_1} = \frac{t_1}{t}; \quad \frac{\partial\phi}{\partial\phi_2} = \frac{t_2}{t}; \quad \frac{\partial\phi}{\partial\vartheta} = 0 \tag{1-68}$$

The last equation of (68) shows that $\phi$ depends on $\phi_1$ and $\phi_2$ but not on $\vartheta$,
so that according to the other two equations of (68), the ratios $t_1/t$ and
$t_2/t$ are also independent of $\vartheta$:

$$\frac{\partial}{\partial\vartheta}\left(\frac{t_1}{t}\right) = 0; \quad \frac{\partial}{\partial\vartheta}\left(\frac{t_2}{t}\right) = 0$$

This result may be written:

$$\frac{1}{t_1}\frac{\partial t_1}{\partial\vartheta} = \frac{1}{t_2}\frac{\partial t_2}{\partial\vartheta} = \frac{1}{t}\frac{\partial t}{\partial\vartheta} \tag{1-69}$$

18 The proof of this fact as well as other mathematical conclusions reached here
are given in sec. 2.18. Except for the proofs, the present section is complete in itself.


Now $t_1$ is a function of the state of the first member of the system and therefore
could depend only on $\phi_1$ and $\vartheta$, while $t_2$ could depend only on $\phi_2$ and $\vartheta$.
However, (69) indicates that $t_1$ and $t_2$ must actually satisfy the following
equation

$$\frac{\partial \ln t_1}{\partial\vartheta} = \frac{\partial \ln t_2}{\partial\vartheta} = \frac{\partial \ln t}{\partial\vartheta} = g(\vartheta) \tag{1-70}$$

where $g(\vartheta)$ is a function which is common to all systems in thermal contact,
not dependent on any special properties of the substances which compose
the system. Integrating (70), we obtain

$$\ln t = \int g(\vartheta)\,d\vartheta + \ln A(\phi) \tag{1-71}$$


where the integration constant ln A depends only on the quantity $\phi$.
Note that we have dropped the subscripts from t and $\phi$ so that eq. (71)
refers to any thermodynamic system and t is the appropriate integrating
denominator for the particular system under consideration. We see from
(71) the important fact that this denominator can be separated into two
parts, one depending only on the empirical temperature $\vartheta$ and the other
only on variables of the state of the system such as $\phi$ whose differential is
exact.
Let us rewrite (71) in the form

$$t = A\,e^{\int g(\vartheta)\,d\vartheta} \tag{1-72}$$

and define the absolute temperature T by the relation

$$T(\vartheta) = C\,e^{\int g(\vartheta)\,d\vartheta} \tag{1-73}$$

The constant C relating $\vartheta$ and T may be determined by requiring that
between two fixed points, say the boiling point and freezing point of water,
T shall increase by 100 units. It should be noticed that there is no additive
constant in (73), so that if C is positive, the smallest value of T is zero, and
there is no upper limit for T.

If our thermodynamic system contains only one part, we may use (72),
(73) and (66) to write

$$dQ = t\,d\phi = \frac{T}{C}\,A\,d\phi \tag{1-74}$$

Also, if we put

$$S = \frac{1}{C}\int A(\phi)\,d\phi + \text{const.} \tag{1-75}$$

we obtain the well-known expression for the second law of thermodynamics
which defines a change in entropy, dS:

$$dQ = T\,dS \tag{1-76}$$

The entropy is immediately seen to be a function of the state of the system,
constant along an adiabatic path (dQ = 0). It is determined except for an
additive constant. We also note from (76) that the absolute temperature
is an integrating denominator of the inexact differential dQ.
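When the system is made up of two parts which are in thermal contact,
eqs. (67) and (74) may be combined to give

$$A\,d\phi = A_1\,d\phi_1 + A_2\,d\phi_2 \tag{1-77}$$

We know that $A_1$ is a function of $\phi_1$ and that $A_2$ is a function of $\phi_2$. We
want to prove that A is a function of $\phi$ which in turn depends on $\phi_1$ and
$\phi_2$. Let us assume that $A = A(\phi)$. Then

$$\frac{\partial A}{\partial\phi_1} = \frac{\partial A}{\partial\phi}\frac{\partial\phi}{\partial\phi_1}; \quad \frac{\partial A}{\partial\phi_2} = \frac{\partial A}{\partial\phi}\frac{\partial\phi}{\partial\phi_2}$$

If we eliminate $\partial A/\partial\phi$ from these two equations we obtain

$$\frac{\partial A}{\partial\phi_1}\frac{\partial\phi}{\partial\phi_2} - \frac{\partial A}{\partial\phi_2}\frac{\partial\phi}{\partial\phi_1} = 0 \tag{1-78}$$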

This result is often written in the Jacobian notation of sec. 1.12

$$J(A,\phi/\phi_1,\phi_2) = 0$$

It tells us¹⁹ that if A is a function of $\phi$, $J(A,\phi) = 0$ and conversely if
$J(A,\phi) = 0$, then A is a function of $\phi$. We can easily prove in our case
that the Jacobian does vanish. Differentiation of (77) results in

$$A\frac{\partial\phi}{\partial\phi_1} = A_1; \quad A\frac{\partial\phi}{\partial\phi_2} = A_2$$

$$\frac{\partial A}{\partial\phi_2}\frac{\partial\phi}{\partial\phi_1} + A\frac{\partial^2\phi}{\partial\phi_1\,\partial\phi_2} = 0; \quad \frac{\partial A}{\partial\phi_1}\frac{\partial\phi}{\partial\phi_2} + A\frac{\partial^2\phi}{\partial\phi_2\,\partial\phi_1} = 0$$

hence by subtraction we obtain (78). Thus A is a function of $\phi$. Under
these conditions we have an equation similar to (76) for each part of the
thermodynamic system, and since $dQ = \sum_i dQ_i$, we finally conclude from
(75) and (77) that $dS = \sum_i dS_i$.


19 This result which may be applied in the case of n variables is often useful. If the
n functions $y_1, y_2, \ldots, y_n$ are not independent of each other the Jacobian vanishes; if
J = 0, then the n functions are related by some equation $f(y_1, y_2, \ldots, y_n) = 0$; see
sec. 3.13.

CHAPTER 2
ORDINARY DIFFERENTIAL EQUATIONS


2.1. Preliminaries.—The customary classification distinguishes two
main types: ordinary and partial differential equations. The former 
contain only one independent variable and, as a consequence, total deriva- 
tives. They represent a relation between the primitive of the dependent 
variable (y), its various derivatives, and functions of the independent 
variable (x). Partial differential equations, whose study will be reserved 
for Chapter 7, contain several independent variables and hence partial 
derivatives. Concerning terminology, the following is to be noted in 
connection with ordinary differential equations. 

The order of a differential equation is the order of its highest derivative; 
its degree is the degree (or power) of the derivative of highest order after 
the equation has been rationalized, i.e., after fractional powers of all 
derivatives have been removed. Thus the equation 


$$\frac{d^2y}{dx^2} + \left(\frac{dy}{dx}\right)^2 + xy = 0$$


is of the second order and the first degree, while 


a d 


is of the second order and the second degree. If the dependent variable 
and all its derivatives occur in the first degree and not multiplying each 
other, the equation is said to be linear. The solution of an equation of 
n-th order involves, in principle, the carrying out of n quadratures or inte- 
grations. Since each of them introduces one arbitrary constant, the final 
expression for the dependent variable will contain n arbitrary constants.
However, a solution in which one or more of these constants are given
specific values, for instance the value zero, will also satisfy the differential
equation. In view of this consideration two types of solutions of an ordinary
differential equation of n-th order may be distinguished: (1) the
complete or general solution which contains its full complement of n independent¹
arbitrary constants; (2) particular solutions, obtainable from
the general one by fixing one or more of the constants. In addition to
these, differential equations of degree higher than the first frequently possess
solutions, known as singular ones, which cannot be formed from the general
solution in this manner. An example of these will be discussed briefly in
sec. 2.6; they are rarely of interest in physical or chemical applications.


FIRST ORDER EQUATIONS 


An equation of the first order can always be solved although the solu- 
tion may sometimes not be expressible in terms of familiar or named 
functions. Methods of solution applicable in the most frequently occurring 
cases will now be given, and the discussion of each method will be followed 
by a list of problems, arising in physics and chemistry, which lead to 
differential equations solvable by the scheme in question. 

2.2. The Variables are Separable.—This is true when the equation,
which may originally appear in the form $f_1(x,y)\,\dfrac{dy}{dx} + f_2(x,y) = 0$, is
reducible to

$$f(x)\,dx + g(y)\,dy = 0$$

Such an equation can be integrated at once and leads to a relation between
y and x.


Examples. 

a. Organic growth; radioactive decay. 
Bacterial cultures in an unlimited nutritive medium grow at a time rate
proportional to the number of bacteria present at any moment. Hence if
the time t is regarded as independent variable and N, the number of bacteria
present at time t, as dependent variable,

$$\frac{dN}{dt} = aN$$

a being the rate of growth per bacterium. This may be written

$$\frac{dN}{N} = a\,dt$$

which, on integration, yields $\ln N = at + c$, or $N = Ce^{at}$. If the original
number of bacteria at t = 0 is $N_0$, the constant C must have the value $N_0$
to conform to this physical condition.

1 Arbitrary constants are said to be independent if two or more of them cannot be
replaced by an equivalent single one. Thus the constants $c_1$ and $c_2$ in the functions
$ax + c_1 + c_2$ and $c_1e^{x+c_2}$ are not independent because these functions may be written
$ax + c$ and $ce^x$, respectively.

This distinction is elementary. A more adequate analysis would focus attention
upon independent solutions of the differential equation rather than independent constants.
Solutions are independent when the so-called Wronskian determinant fails to
vanish. This matter is treated in sec. 3.13.

Radioactive atoms decay at a rate proportional to the number of atoms,
N, present at any moment, t. Hence $dN/dt = -\lambda N$, which has the solution
$N = N_0e^{-\lambda t}$. The disintegration constant $\lambda$ measures the time rate
of decay per atom. It is a fundamental quantity characteristic of each
radioactive substance.
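
The decay law can be checked by stepping the differential equation forward directly. The sketch below is ours and purely illustrative: it compares a crude Euler integration of $dN/dt = -\lambda N$ with the closed form $N_0e^{-\lambda t}$, using arbitrary values of $N_0$ and $\lambda$. The half-life $\ln 2/\lambda$ is a standard consequence of the solution, not something stated in the text.

```python
import math

def decay_exact(n0, lam, t):
    """Closed-form solution N = N0 * exp(-lam * t)."""
    return n0 * math.exp(-lam * t)

def decay_euler(n0, lam, t, steps=100_000):
    """Step dN/dt = -lam * N forward with the simple Euler rule."""
    dt = t / steps
    n = n0
    for _ in range(steps):
        n += -lam * n * dt
    return n

n0, lam, t = 1.0e6, 0.1, 10.0          # arbitrary illustrative values
print(decay_exact(n0, lam, t), decay_euler(n0, lam, t))

half_life = math.log(2) / lam          # time for N to fall to N0 / 2
print(decay_exact(n0, lam, half_life) / n0)
```

Replacing `-lam` by a positive growth rate a gives the bacterial growth law of the preceding paragraph.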


b. Flow of water from an orifice. 


A vertical tank of uniform cross-section A is filled with water to an initial
height $h_0$. Water flows out through a hole of area a. It is desired to find
the height of the water, h, in the tank as a function of the time, t. The
volume flowing out in time dt is $av\,dt$, where v is the velocity of the water
at the orifice at time t. The loss of height in the tank is dh, hence the loss
of volume $A\,dh$. Therefore

$$av\,dt = -A\,dh$$

But the velocity is related to the height by Torricelli’s formula: $v = c\sqrt{2gh}$.
The empirical constant c would be unity if there were no obstruction and no
“vena contracta” near the orifice; for ordinary small holes with sharp
edges it is 0.6. Thus

$$ac\sqrt{2gh}\,dt = -A\,dh$$

or

$$\frac{dh}{\sqrt{h}} = -c\,\frac{a}{A}\sqrt{2g}\,dt$$

On integrating this we have

$$\sqrt{h} = \sqrt{h_0} - \frac{ca}{2A}\sqrt{2g}\,t$$

where the constant of integration has been so adjusted that $h = h_0$ at
t = 0.
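
The integrated result may be compared with a direct numerical integration of $ac\sqrt{2gh}\,dt = -A\,dh$. The Python sketch below is ours; the tank dimensions are arbitrary assumptions, and the emptying time follows simply from setting $\sqrt{h} = 0$ in the formula above.

```python
import math

def height_exact(t, h0, A, a, c=0.6, g=9.81):
    """h(t) from sqrt(h) = sqrt(h0) - (c*a/(2*A)) * sqrt(2*g) * t."""
    root = math.sqrt(h0) - c * a / (2 * A) * math.sqrt(2 * g) * t
    return max(root, 0.0) ** 2          # the tank stays empty once drained

def height_numeric(t, h0, A, a, c=0.6, g=9.81, steps=20_000):
    """Integrate dh/dt = -(c*a/A)*sqrt(2*g*h) with the midpoint rule."""
    dt = t / steps
    h = h0
    for _ in range(steps):
        k1 = -c * a / A * math.sqrt(2 * g * h)
        h_mid = max(h + 0.5 * dt * k1, 0.0)
        k2 = -c * a / A * math.sqrt(2 * g * h_mid)
        h = max(h + dt * k2, 0.0)
    return h

def emptying_time(h0, A, a, c=0.6, g=9.81):
    """Time at which sqrt(h) reaches zero."""
    return 2 * A * math.sqrt(h0) / (c * a * math.sqrt(2 * g))

# Assumed example: 1 m of water, 1 m^2 tank, 10 cm^2 sharp-edged hole.
h0, A, a = 1.0, 1.0, 1.0e-3
print(height_exact(300.0, h0, A, a), height_numeric(300.0, h0, A, a))
print(emptying_time(h0, A, a))
```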


c. Heat flow. 
When heat flows through a body the temperature, 7, is in general a compli- 
cated function of the coordinates within the body. In simple cases, how- 
ever, it may depend only on a single coordinate, x (distance from a heated 
plane, or distance from a point source of heat). In that case, the rate at 
which heat crosses an area, A perpendicular to z is given by 
aT 


R= -kA an (2-1) 
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and R is constant because of the continuity of flow. The quantity k is 
known as the thermal conductivity. 
(a) If the body is a slab with plane parallel faces, one of which is 
maintained at a temperature Tj, integration of (1) leads to 
B Re 
kA 
x being the distance from the heated face. From this one obtains the 
elementary relation 
Tı — Tə 
d 


for the heat transfer across a plate of thickness d. 

(8) If a heat source is placed at the center of a sphere, the temperature 
isa function of r alone. Here A = 477? and (1) reads — 4rkr?(dT /dr) = R, 
which gives 


R=kA (2-2) 


In this case, the temperature is not a linear function of the distance from 
the source as it was in (a). 

(γ) At constant external temperature the thickness of ice on quiescent
water increases as the square root of the time. To show this we write (2)
in the form

$$\frac{dH}{dt} = kA\,\frac{T_1 - T_2}{x}$$

where $x$ now represents the thickness of ice and $dH$ the quantity of heat
transported away from the lower surface of the ice in time $dt$. This, however,
is proportional to the thickness $dx$ which is added on to the already
existing layer in time $dt$. Hence $dx/dt = C/x$, $C$ representing a constant.
From this it follows by integration that

$$x^2 \sim t$$


d. Salt dissolving in water. 
When $x_0$ grams of salt are placed in $M$ grams of water at time $t = 0$, how
many grams will remain undissolved at time $t$? The rate of solution,
$-dx/dt$ (the rate at which $x$ decreases), is proportional (a) to the number of
grams, $x$, undissolved at time $t$, (b) to the difference between the saturation
concentration, $X/M$, and the actual concentration, $(x_0 - x)/M$. ($X$ is the
number of grams of salt that would produce saturation.) Thus

$$-\frac{dx}{dt} = kx\left(\frac{X}{M} - \frac{x_0 - x}{M}\right) \tag{2-3}$$

To solve, we write

$$-\frac{dx}{x(X - x_0 + x)} = -\frac{1}{X - x_0}\left(\frac{dx}{x} - \frac{dx}{X - x_0 + x}\right) = \frac{k}{M}\,dt$$

Integration then leads to: $\ln\dfrac{X - x_0 + x}{x} + c = \dfrac{X - x_0}{M}\,kt$. When
the constant $c$ is adjusted so that $x = x_0$ at $t = 0$, the result is

$$\ln\frac{(X - x_0 + x)\,x_0}{xX} = \frac{X - x_0}{M}\,kt$$

If $x_0 = X$, then the solution is $\dfrac{1}{x} - \dfrac{1}{x_0} = (k/M)t$, as one may easily verify
by going back to equation (3).
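As a check on the algebra, the logarithmic result can be solved explicitly for $x$ and
compared with the differential equation. The Python sketch below does this under
stated assumptions: the function names and the numerical values are hypothetical, and
the rate law is taken with $k > 0$ and $x$ decreasing, which is the sign convention the
integrated results above require.

```python
import math

def undissolved(t, x0, X, M, k):
    """Grams of salt still undissolved at time t, with x(0) = x0.

    Obtained by solving ln[(X - x0 + x) * x0 / (x * X)] = (X - x0) * k * t / M
    for x; the case x0 == X uses 1/x - 1/x0 = (k / M) * t instead.
    """
    D = X - x0
    if D == 0.0:
        return 1.0 / (1.0 / x0 + k * t / M)
    return D / ((X / x0) * math.exp(D * k * t / M) - 1.0)

def dissolution_rate(x, x0, X, M, k):
    """Rate of solution k * x * (X/M - (x0 - x)/M), i.e. -dx/dt."""
    return k * x * (X - x0 + x) / M

# Hypothetical numbers: 20 g of salt in 100 g of water, saturation at 36 g.
x0, X, M, k = 20.0, 36.0, 100.0, 0.5
for t in (0.0, 5.0, 20.0):
    print(t, undissolved(t, x0, X, M, k))
```

The same closed form also covers $x_0 > X$, where the undissolved amount tends to
$x_0 - X$ instead of zero.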
e. Atmospheric pressure at any height. 


The increment of pressure between two points in the atmosphere differing
in height by $dh$ is $dP = -\rho g\,dh$, if $\rho$ is the density at height $h$. But $\rho$ is
related to $P$ by the expression $P\rho^{-\gamma} = P_0\rho_0^{-\gamma}$, which is valid for adiabatic
expansion of air if $\gamma$ is taken to be 1.4.² The quantities $P_0$ and $\rho_0$ are the
sea level values of $P$ and $\rho$. Therefore

$$dP = -\left(\frac{P}{P_0}\right)^{1/\gamma}\rho_0g\,dh$$

and this, on integration, gives

$$\left(\frac{P}{P_0}\right)^{(\gamma-1)/\gamma} = 1 - \frac{\gamma - 1}{\gamma}\,\frac{\rho_0gh}{P_0}$$

the constant of integration being adjusted so that $P = P_0$ at $h = 0$.
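The adiabatic pressure law is easily tabulated. In the Python sketch below the
function names and the default sea-level values (in SI units) are assumptions chosen
for illustration; only the two formulas themselves come from the text.

```python
import math

def adiabatic_pressure(h, P0=101325.0, rho0=1.225, gamma=1.4, g=9.81):
    """Pressure at height h from
    (P/P0)^((gamma-1)/gamma) = 1 - ((gamma-1)/gamma) * rho0 * g * h / P0.

    The defaults (Pa, kg/m^3, m/s^2) are assumed sea-level values.
    """
    bracket = 1.0 - (gamma - 1.0) / gamma * rho0 * g * h / P0
    if bracket <= 0.0:
        return 0.0  # the formula gives zero pressure at and above this height
    return P0 * bracket ** (gamma / (gamma - 1.0))

def adiabatic_density(P, P0=101325.0, rho0=1.225, gamma=1.4):
    """Density from P * rho^(-gamma) = P0 * rho0^(-gamma)."""
    return rho0 * (P / P0) ** (1.0 / gamma)

for h in (0.0, 1000.0, 5000.0, 10000.0):
    print(f"h = {h:7.0f} m   P = {adiabatic_pressure(h):9.0f} Pa")
```

Unlike the isothermal (exponential) atmosphere, this law predicts a finite height at
which the pressure vanishes.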


f. Homogeneous gas reactions. 


Chemical reactions involving but a single phase are said to be homogeneous.
Among these there may be distinguished unimolecular, bimolecular,
termolecular reactions and so on. In the unimolecular case, the number of
molecules undergoing a chemical change is at any instant proportional to
the number of molecules present. The decomposition of nitrogen pentoxide
into oxygen and nitrogen tetroxide (2N₂O₅ → O₂ + 2N₂O₄) is an example
of this kind, the differential equation being similar to that describing
radioactive decay (Example a).

In a bimolecular reaction, of which there are numerous examples, substances
A and B form molecules of type C. If $a$ and $b$ are the original
concentrations of A and B respectively, and $x$ is the concentration of C at a
given instant, then

$$\frac{dx}{dt} = k(a - x)(b - x)$$

² $\gamma$ is the ratio of the specific heat at constant pressure to that at constant volume.
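To integrate this equation, the expression $\dfrac{1}{(a - x)(b - x)}$ is resolved into
the partial fractions $\dfrac{1}{a - b}\left[\dfrac{1}{b - x} - \dfrac{1}{a - x}\right]$. We then have

$$\frac{1}{a - b}\left[\frac{dx}{b - x} - \frac{dx}{a - x}\right] = k\,dt$$

whence

$$\frac{1}{a - b}\ln\frac{a - x}{b - x} = kt + c$$

Since $x = 0$ at $t = 0$, $c = \dfrac{1}{a - b}\ln\dfrac{a}{b}$, so that

$$\ln\frac{b(a - x)}{a(b - x)} = (a - b)kt$$

From this, the reaction rate is seen to be

$$k = \frac{1}{(a - b)t}\ln\frac{b(a - x)}{a(b - x)}$$

The concentration of substance C is

$$x = \frac{a\left(1 - e^{(a-b)kt}\right)}{1 - \dfrac{a}{b}\,e^{(a-b)kt}}$$
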


When the original concentrations $a$ and $b$ are equal, the expression for $k$
becomes indeterminate, but on putting $b = a + \epsilon$ and letting $\epsilon$ approach
zero, an expansion of the logarithm yields

$$k = \frac{1}{t}\,\frac{x}{a(a - x)}$$

which is also seen to be a solution of the differential equation

$$\frac{dx}{dt} = k(a - x)^2$$

Other types of reactions will be dealt with in the problems on p. 40. As to
terminology, we note that a rate law for multimolecular reactions of the
form

$$\frac{dx}{dt} = k(a_1 - x)^{n_1}(a_2 - x)^{n_2}\cdots(a_s - x)^{n_s}$$

is often said to describe a reaction of the $n$-th order, where

$$n = \sum_{i=1}^{s} n_i$$



g. Clapeyron’s equation. 


Any phase change of a substance which takes place at constant pressure and
temperature conforms to Clapeyron's equation:

$$\frac{dP}{dT} = \frac{l}{T(V_f - V_i)}$$

Here $l$ represents the latent heat of the process, $V_f$ and $V_i$ the volume per
mole of the final and the initial phase respectively, and $P$ the pressure.
This equation may be applied to the process of sublimation, yielding an
approximate expression for the vapor pressure as a function of the temperature.
In that case $l$, the latent heat of sublimation of the solid, is nearly
constant over a range of temperatures, and $V_i$, the volume of the solid,
may be neglected in comparison with that of the vapor, $V_f$. The vapor,
though not a perfect gas, will be taken to satisfy $V_f = RT/P$. Clapeyron's
equation then becomes

$$\frac{dP}{dT} = \frac{lP}{RT^2}$$

which on integration gives

$$P = c\,e^{-l/RT}$$


an equation often called the Clausius-Clapeyron equation. This result is 
found to be valid over small ranges of temperature, for the vapor pressure of 
both solids and liquids. A more refined result may be obtained by intro- 
ducing for l a more adequate approximation. 


h. Centrifuge problem. 


When a cylinder of height $h$, filled with fluid, is rotating about its axis, the
pressure within the fluid will not be constant but will depend on $r$. Consider
a cylindrical shell of fluid of thickness $dr$, the surfaces of which are
coaxial with the rotating vessel. The net force pushing inward on this
shell is $2\pi rh\,dP$. This must equal the centripetal force due to the angular
speed $\omega$, namely $m\omega^2r$, where $m$, the mass of the fluid, is given by $2\pi rh\,dr\cdot\rho$.
Hence

$$2\pi rh\,dP = 2\pi rh\rho\,dr\cdot\omega^2r$$

(α) If the fluid is a liquid, the density, $\rho$, is constant and the solution is

$$P = \tfrac{1}{2}\rho\omega^2r^2 + P_0$$

(β) If the fluid is a gas, $P = c\rho$ (since $PV$ = const.), the solution is

$$P = P_0\,e^{\omega^2r^2/2c}$$

i. Soap film.


If a soap film is stretched between two circular wires, both having their
planes perpendicular to the line joining their centers, it will form a figure
of revolution about that line. At every point such as P (cf. Fig. 1) the
horizontal force acting around a vertical section of the film is the same.
Hence

$$2\pi yT\cos\theta = \text{const.}$$

Fig. 2-1

where $T$ is the surface tension of the film. But

$$\cos\theta = \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{-1/2}$$

hence

$$y\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{-1/2} = c$$

$c$ being a constant. Solving for the derivative,

$$\frac{dy}{dx} = \left(\frac{y^2}{c^2} - 1\right)^{1/2}$$

so that

$$\frac{dy}{(y^2 - c^2)^{1/2}} = \frac{dx}{c}$$

which leads to

$$y = c\cosh\left(\frac{x}{c} + c_1\right)$$

The constants $c$ and $c_1$ may be expressed in terms of the distance between
the wires and their radius. The longitudinal section of the film is seen to
be a catenary.

The examples above seem sufficient to illustrate the method under dis- 
cussion. The problems leading to separable first order equations are very 
numerous. 


Problems. 
a. Helmholtz' equation.

If a circuit has resistance $R$ and inductance $L$, the current $I$ in it obeys the differential
equation

$$L\frac{dI}{dt} + RI = E$$

where $E$ is the impressed or external electromotive force. Show that the growth of a
current ($E$ = const., $I = 0$ at $t = 0$) is described by

$$I = \frac{E}{R}\left(1 - e^{-(R/L)t}\right)$$

and the decay ($E = 0$, $I = I_0$ at $t = 0$) by

$$I = I_0\,e^{-(R/L)t}$$


b. Solve the equation for termolecular reactions:

$$\frac{dx}{dt} = k(a - x)(b - x)(c - x).$$

Ans.

$$\left(\frac{a - x}{a}\right)^{c-b}\left(\frac{b - x}{b}\right)^{a-c}\left(\frac{c - x}{c}\right)^{b-a} = e^{(a-b)(b-c)(a-c)kt}$$

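c. Solve the equation for opposing unimolecular and bimolecular reactions:

$$\frac{dx}{dt} = k_1(a - x) - k_2x^2$$

under the condition $x = 0$ at $t = 0$.

Ans. $\dfrac{a}{x} = \dfrac{k_2}{k_1}A\coth Ak_2t + \dfrac{1}{2}$, where $A^2 = \dfrac{k_1}{k_2}\left(a + \dfrac{1}{4}\dfrac{k_1}{k_2}\right)$.

Show that, when equilibrium is established ($t = \infty$),

$$\frac{x^2}{a - x} = \frac{k_1}{k_2}$$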
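d. Solve the equation for consecutive unimolecular reactions of the type

$$A \xrightarrow{k_1} B \xrightarrow{k_2} C$$

that is,

$$\frac{dn_1}{dt} = -k_1n_1; \qquad \frac{dn_2}{dt} = k_1n_1 - k_2n_2$$

Ans.

$$\frac{n_3}{n_1 + n_2 + n_3} = 1 - \frac{k_2}{k_2 - k_1}\,e^{-k_1t} + \frac{k_1}{k_2 - k_1}\,e^{-k_2t}$$

where $n_3$ = amount of C present at $t$.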
e. A projectile is fired vertically into the air with initial velocity $V$. (1) Find its
speed at any height; (2) find the time at which it will have traversed a distance $r$.

Note: the differential equation to be solved is

$$\frac{dv}{dt} = v\frac{dv}{dr} = -\frac{gR^2}{r^2}$$

where $g$ = acceleration due to gravity, $R$ = radius of the earth.

Ans. (1)

$$v = \left[V^2 - 2gR + \frac{2gR^2}{r}\right]^{1/2}$$

(2) If $V^2 > 2gR$,

$$t = (V^2 - 2gR)^{-1}\left\{r\left(V^2 - 2gR + \frac{2gR^2}{r}\right)^{1/2} - VR\right\}$$
$$\qquad - \frac{2gR^2}{(V^2 - 2gR)^{3/2}}\left\{\ln\frac{\left(V^2 - 2gR + \dfrac{2gR^2}{r}\right)^{1/2} + (V^2 - 2gR)^{1/2}}{V + (V^2 - 2gR)^{1/2}} + \frac{1}{2}\ln\frac{r}{R}\right\}$$



2.3. The Differential Equation is, or Can be Made, Exact. Linear 
Equations.—A differential equation, written in the form 


$$A\,dx + B\,dy = 0 \tag{2-4}$$

where $A$ and $B$ are functions of $x$ and $y$, is said to be exact if the left-hand
side is an exact differential. The necessary and sufficient condition for this
to be true was shown in sec. 1.7 to be equivalent to the Cauchy relations

$$\frac{\partial A}{\partial y} = \frac{\partial B}{\partial x}$$

The equations considered in the foregoing section, where $A$ was a function
of $x$ alone and $B$ a function of $y$ alone, are exact in the trivial sense that
$\partial A/\partial y = \partial B/\partial x = 0$.

Differential equations occurring in practice are rarely exact, but every
equation of the form (4) can be made exact and then integrated. The
device for doing this is to multiply it by a suitable factor known as the
integrating factor. For instance, the equation

$$\frac{dy}{y} + \left(\frac{1}{x} - \frac{x}{y}\right)dx = 0$$

is not exact. It becomes exact on multiplication by $xy$. For it then takes
the form

$$d\left(xy - \frac{x^3}{3}\right) = 0$$

which has the solution:

$$xy - \frac{x^3}{3} = \text{const.}$$


While an integrating factor exists for every equation of the form (4), it
is not always easy to find. If the equation is linear, however, that is if it
can be written

$$\frac{dy}{dx} + f(x)y = g(x) \tag{2-5}$$

an integrating factor is always available. It is $e^{\int f\,dx}$. On application of
this factor eq. (5) becomes

$$\frac{d}{dx}\left(ye^{F}\right) = g(x)\,e^{F}$$

where the abbreviation $F(x) = \int^x f(\xi)\,d\xi$ has been used. The solution is,
clearly,

$$y = e^{-F}\left[\int g\,e^{F}\,dx + c\right] \tag{2-6}$$

This result is most useful, for the occurrence of linear equations is very
frequent.
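When the integrals in (6) cannot be done in closed form they may be evaluated
numerically. The Python sketch below is one possible way of doing so, not a method
given in the text: the function name, the trapezoidal rule and the step count are
assumptions. $F$ is measured from the starting point $x_0$, so that the constant $c$
equals the initial value $y(x_0)$.

```python
import math

def solve_linear(f, g, x0, y0, x1, steps=2000):
    """Approximate y(x1) for dy/dx + f(x) y = g(x), y(x0) = y0, via eq. (2-6).

    F(x) is measured from x0, so the constant c in (2-6) equals y0.
    Both integrals are accumulated with the trapezoidal rule.
    """
    h = (x1 - x0) / steps
    F = 0.0                      # running value of F(x) = integral of f
    integral = 0.0               # running value of the integral of g * e^F
    prev_f, prev_ge = f(x0), g(x0)
    for i in range(1, steps + 1):
        x = x0 + i * h
        cur_f = f(x)
        F += 0.5 * h * (prev_f + cur_f)
        cur_ge = g(x) * math.exp(F)
        integral += 0.5 * h * (prev_ge + cur_ge)
        prev_f, prev_ge = cur_f, cur_ge
    return math.exp(-F) * (integral + y0)

# dy/dx + y = 1 with y(0) = 0 has the exact solution y = 1 - e^{-x}.
approx = solve_linear(lambda x: 1.0, lambda x: 1.0, 0.0, 0.0, 1.0)
print(approx, 1.0 - math.exp(-1.0))
```

The two printed numbers should agree to several decimal places; a finer step improves
the agreement.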


Examples. 

a. Circuit containing inductance and resistance (Helmholtz’ equation). 
This problem has already been discussed, but it may be instructive to solve
the differential equation also by the method of eq. (6). We have

$$\frac{dI}{dt} + \frac{RI}{L} = \frac{E}{L} \tag{2-7}$$

Thus

$$F = \int\frac{R}{L}\,dt = \frac{R}{L}\,t$$

so that

$$I = e^{-(R/L)t}\left[\int\frac{E}{L}\,e^{(R/L)t}\,dt + c\right] = \frac{E}{R} + c\,e^{-(R/L)t}$$

and this agrees with our previous result (Problem a).


b. Circuit with inductance and resistance; variable electromotive force. 


The present method involves the solution of eq. (7) when $E$ is a function of
the time, in which case the equation can no longer be separated. Let us
assume that

$$E = E_0\sin\omega t$$

We then have

$$\frac{dI}{dt} + \frac{R}{L}\,I = \frac{E_0}{L}\sin\omega t$$



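Hence³

$$I = e^{-(R/L)t}\,\frac{E_0}{L}\int e^{(R/L)t}\sin\omega t\,dt + c\,e^{-(R/L)t}$$
$$= \frac{E_0}{L}\,\frac{1}{\omega'^2 + \omega^2}\,(\omega'\sin\omega t - \omega\cos\omega t) + c\,e^{-\omega't}$$

where $\omega'$ has been written for $R/L$, a quantity having the dimensions of a
frequency. To fix the constant we assume that $I(0) = 0$, in which case

$$I = \frac{E_0}{L}\,\frac{1}{\omega^2 + \omega'^2}\,(\omega'\sin\omega t - \omega\cos\omega t + \omega e^{-\omega't})$$

The last term represents transient currents which disappear as soon as
$t \gg 1/\omega'$.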
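³ Here and elsewhere, there occurs the integral $\int e^{\omega't}\sin\omega t\,dt$. This is easily
evaluated if the sine is written as an exponential:

$$\sin x = \frac{1}{2i}\left(e^{ix} - e^{-ix}\right).$$

Thus

$$\int e^{\omega't}\sin\omega t\,dt = \frac{1}{2i}\int\left[e^{(\omega'+i\omega)t} - e^{(\omega'-i\omega)t}\right]dt = \frac{1}{2i}\left\{\frac{e^{(\omega'+i\omega)t}}{\omega' + i\omega} - \frac{e^{(\omega'-i\omega)t}}{\omega' - i\omega}\right\}$$
$$= \frac{e^{\omega't}}{2i}\,\frac{(\omega' - i\omega)e^{i\omega t} - (\omega' + i\omega)e^{-i\omega t}}{\omega'^2 + \omega^2} = \frac{e^{\omega't}}{\omega'^2 + \omega^2}\,(\omega'\sin\omega t - \omega\cos\omega t)$$
$$= -\frac{e^{\omega't}}{(\omega'^2 + \omega^2)^{1/2}}\cos(\omega t + \beta), \qquad \beta = \tan^{-1}\frac{\omega'}{\omega}$$

Similarly

$$\int e^{\omega't}\cos\omega t\,dt = \frac{e^{\omega't}}{\omega'^2 + \omega^2}\,(\omega'\cos\omega t + \omega\sin\omega t) = \frac{e^{\omega't}}{(\omega'^2 + \omega^2)^{1/2}}\sin(\omega t + \beta)$$

c. Radioactive decay of mother and daughter substances.

Let $A$ be the number of atoms of the mother substance (e.g., UI) and $B$
the number of atoms of the daughter substance (e.g., UX) at time $t$, $A_0$
being the original value of $A$ at $t = 0$. Let $\lambda_A$ and $\lambda_B$ be the decay
constants as defined in sec. 2.2a. The two substances satisfy the two
differential equations

$$\frac{dA}{dt} = -\lambda_AA; \qquad \frac{dB}{dt} = -\lambda_BB + \lambda_AA$$

When the solution of the first, $A = A_0e^{-\lambda_At}$, is substituted in the second
there results

$$\frac{dB}{dt} + \lambda_BB = \lambda_AA_0\,e^{-\lambda_At}$$

an equation which is linear in $B$ and can be solved by formula (6). The
solution is:

$$B = \frac{\lambda_A}{\lambda_B - \lambda_A}\,A_0\left(e^{-\lambda_At} - e^{-\lambda_Bt}\right)$$

if we assume that $B(0) = 0$. Note that $B$ will reach a maximum at time

$$t = \frac{\ln\lambda_A - \ln\lambda_B}{\lambda_A - \lambda_B}$$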
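The growth and subsequent decay of the daughter substance can be followed
numerically. In the Python sketch below the function names and the decay constants
are hypothetical; the formulas are those just derived, with $B(0) = 0$ and
$\lambda_A \neq \lambda_B$ assumed.

```python
import math

def daughter_atoms(t, A0, lam_a, lam_b):
    """Number of daughter atoms B(t), assuming B(0) = 0 and lam_a != lam_b."""
    return lam_a / (lam_b - lam_a) * A0 * (math.exp(-lam_a * t) - math.exp(-lam_b * t))

def time_of_maximum(lam_a, lam_b):
    """Time at which B(t) is largest."""
    return (math.log(lam_a) - math.log(lam_b)) / (lam_a - lam_b)

# Hypothetical decay constants (per unit time): slow mother, fast daughter.
A0, lam_a, lam_b = 1.0e6, 0.01, 0.5
t_max = time_of_maximum(lam_a, lam_b)
print(t_max, daughter_atoms(t_max, A0, lam_a, lam_b))
```

At the maximum the daughter decays exactly as fast as it is produced, so that
$\lambda_BB = \lambda_AA$ at that instant.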
Problem. A circuit contains capacitance $C$, resistance $R$, and is subject to an
electromotive force $E$. Calculate the instantaneous value of the electric charge $q$ on the
condenser, noting that it satisfies the differential equation

$$R\frac{dq}{dt} + \frac{q}{C} = E$$

Ans. For $E = E_0\sin\omega t$,

$$q = \frac{E_0}{R}\,\frac{1}{\omega^2 + \omega'^2}\,(\omega'\sin\omega t - \omega\cos\omega t + \omega e^{-\omega't}), \qquad \omega' = \frac{1}{RC}$$


2.4. Equations Reducible to Linear Form.—Of some mathematical 
interest is an equation of the form 


$$\frac{dy}{dx} + f(x)y = g(x)y^n \tag{2-8}$$

because it can be made linear by the substitution $y = u^{1/(1-n)}$. This
converts (8) into

$$\frac{du}{dx} + (1 - n)fu = (1 - n)g$$

which can be solved by the method of the preceding section. Eq. (8) is
often called Bernoulli's equation.

2.5. Homogeneous Differential Equations.—A first order equation is
said to be homogeneous⁴ if, the equation being written in the form

$$A\,dx + B\,dy = 0$$

$A$ and $B$ are homogeneous functions of the same degree, i.e.,

$$A(tx, ty) = t^nA(x, y); \qquad B(tx, ty) = t^nB(x, y)$$

If this is true we can substitute $y = vx$, obtaining

$$A(x, y) = A(x, vx) = x^nA(1, v); \qquad B(x, y) = x^nB(1, v)$$

The original equation,

$$\frac{dy}{dx} = -\frac{A}{B}$$

is converted into

$$v + x\frac{dv}{dx} = -\frac{A(1, v)}{B(1, v)} \equiv f(v)$$

by this substitution, and this equation is separable, yielding

$$\frac{dv}{f(v) - v} = \frac{dx}{x}$$


⁴ A remark on the use of the word "homogeneous" in mathematics seems in order,
for the term is used with several different meanings in different contexts. The following
definitions correspond to the chief usages.

1. Homogeneous function: $f(x_1, x_2, \cdots, x_n)$ is said to be homogeneous in all its
variables if, for any parameter, $t$, $f(tx_1, tx_2, \cdots, tx_n) = t^\kappa f(x_1, x_2, \cdots, x_n)$. $\kappa$ is the
"degree" of the homogeneous function.

2. Homogeneous equations: A set of simultaneous linear algebraic equations of the
form

$$\sum_{i=1}^{n} a_{ij}x_i = c_j; \qquad j = 1, 2, \cdots$$

in which the $a$'s are constants is said to be homogeneous if all $c$'s are zero.

3. Homogeneous differential equations: (Two usages of the term!)

a. A first order equation of the form $A\,dx + B\,dy = 0$ is said to be homogeneous if
$A(x, y)$ and $B(x, y)$ are homogeneous functions of the same degree.

b. In general, $F(x, y, y', y'', \cdots) = 0$ is said to be homogeneous if $F$ is a homogeneous
function of $y$ and all its derivatives, but not necessarily of $x$. Thus

$$\frac{d^ny}{dx^n} + f_1(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + f_n(x)\,y = 0$$

is homogeneous and linear. If the right-hand side of this equation were not zero but
equal to a function of $x$, the equation would still be linear but no longer homogeneous.

Example. Lines of force.
An equation closely related to the homogeneous type, and tractable by the
substitution here described, is the differential equation for lines of force.
A line of force is defined as that curve which is tangent, at every point
through which it passes, to the force at that point. The present analysis
is applicable to attracting mass points, attracting or repelling electric
charges, and magnetic poles. Let it be desired, for example, to find the
lines of force due to two charges, $q_1$ and $q_2$, a distance $2a$ apart. (Cf.
Fig. 2.) If we restrict our consideration to the plane containing the charges
and the point $P$, then, for every point in this plane, the definition of a line
of force requires that

$$\frac{dy}{dx} = \frac{F_y}{F_x} = \frac{\dfrac{q_1}{r_1^3}(y + a) + \dfrac{q_2}{r_2^3}(y - a)}{\dfrac{q_1}{r_1^3}\,x + \dfrac{q_2}{r_2^3}\,x} \tag{2-9}$$

Fig. 2-2

If $a$ were zero, this would reduce to $dy/dx = y/x$, an equation which has
for its solution all straight lines through the origin. These, as is well
known, represent the lines of force due to a point charge. In general,
however, eq. (9) reads

$$\frac{q_1}{r_1^3}\left[x\,dy - (y + a)\,dx\right] + \frac{q_2}{r_2^3}\left[x\,dy - (y - a)\,dx\right] = 0 \tag{2-9a}$$

This equation misses being homogeneous by the presence of the quantity $a$.
But a simple artifice will help. If we introduce two new dependent
variables, $y_1 = y + a$ and $y_2 = y - a$, so that $dy_1 = dy_2 = dy$; $r_1 =
(x^2 + y_1^2)^{1/2}$, $r_2 = (x^2 + y_2^2)^{1/2}$, eq. (9a) takes the form

$$q_1\frac{x\,dy_1 - y_1\,dx}{(x^2 + y_1^2)^{3/2}} + q_2\frac{x\,dy_2 - y_2\,dx}{(x^2 + y_2^2)^{3/2}} = 0$$

each part of which is homogeneous. Now put $y_1 = v_1x$, $y_2 = v_2x$ so that

$$x^2\,dv = x\,dy - y\,dx$$

The result is then simply

$$q_1\frac{dv_1}{(1 + v_1^2)^{3/2}} + q_2\frac{dv_2}{(1 + v_2^2)^{3/2}} = 0$$

When this is integrated, we immediately obtain the equation of the lines
of force due to the two charges:

$$q_1\frac{v_1}{(1 + v_1^2)^{1/2}} + q_2\frac{v_2}{(1 + v_2^2)^{1/2}} = \text{const.}$$


2.6. Note on Singular Solutions. Clairaut’s Equation.—A first order 
equation of degree higher than the first may have a special kind of solution 
which is not obtainable by specifying the constants in its general solution. 


Thus consider

$$y = x\frac{dy}{dx} + \left(\frac{dy}{dx}\right)^2 \tag{2-10}$$

This equation may be solved by the following artifice. Differentiate once
more, thus converting it into a second order equation, which, however, can
easily be handled by the methods already discussed. The result is

$$\frac{dy}{dx} = \frac{dy}{dx} + x\frac{d^2y}{dx^2} + 2\frac{dy}{dx}\frac{d^2y}{dx^2}$$

or

$$\left(x + 2\frac{dy}{dx}\right)\frac{d^2y}{dx^2} = 0 \tag{2-11}$$

If now the first factor be cancelled, the equation is

$$\frac{d^2y}{dx^2} = 0$$

and has the solution $y = c_1x + c_2$. This, however, is too general a result
since it contains two constants of integration, a circumstance brought
about by the arbitrary procedure of converting the original first order
into a second order equation before solving. To satisfy eq. (10), it is
necessary to substitute this solution and adjust $c_2$ in conformity with its
demands. It is then seen that $c_2 = c_1^2$, and

$$y = cx + c^2$$

is the general solution of eq. (10).

But eq. (11) can also be satisfied by equating the first factor on the left
to zero. This leads to

$$x + 2\frac{dy}{dx} = 0, \quad\text{or}\quad y = -\frac{x^2}{4} + c$$

This will satisfy eq. (10) if $c = 0$. Thus

$$y = -\frac{x^2}{4}$$

is another solution of the original differential equation, but one which is not
derivable from its complete solution. It is called a singular solution.
Inspection will show that it represents the envelope of all the straight lines
which correspond to the complete solution. This is generally the meaning
of singular solutions.

An equation of the form

$$y = x\frac{dy}{dx} + f\left(\frac{dy}{dx}\right)$$

is known to mathematicians as Clairaut's equation. Eq. (10) is a specimen
of this type. Clairaut's equation can always be handled by the method
here used and has the general solution

$$y = cx + f(c)$$
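The envelope property of the singular solution of eq. (10) can be verified directly.
The Python sketch below uses hypothetical function names; it evaluates the residual of
eq. (10) and confirms that each line $y = cx + c^2$ meets the parabola $y = -x^2/4$ at
$x = -2c$ with the same slope.

```python
def clairaut_residual(x, y, slope):
    """Left side minus right side of (2-10): y - (x y' + y'^2)."""
    return y - (x * slope + slope * slope)

def general_solution(x, c):
    """Member y = c x + c^2 of the one-parameter family; its slope is c."""
    return c * x + c * c

def singular_solution(x):
    """Singular solution y = -x^2 / 4; its slope is -x / 2."""
    return -x * x / 4.0

for c in (-2.0, -0.5, 1.0, 3.0):
    x_touch = -2.0 * c  # abscissa where this line meets the parabola
    same_point = general_solution(x_touch, c) == singular_solution(x_touch)
    same_slope = c == -x_touch / 2.0
    print(c, x_touch, same_point, same_slope)
```

A line and a curve that share a point and a slope there are tangent, which is what is
meant by saying that the parabola is the envelope of the family of lines.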




EQUATIONS OF HIGHER ORDER 


A general method for solving certain differential equations of higher 
order will be presented in secs. 2.10-12. It seems appropriate, however, to 
discuss first a few special types of differential equations which can be solved 
by elementary means. While the theory given in this section is applicable 
to equations of any order, emphasis will be placed solely on second order 
equations because of their prominence in mathematical physics. 

2.7. Linear Equations with Constant Coefficients; Right-Hand Mem- 
ber Zero.—In discussing this type of equation it becomes convenient to 
introduce a new notation; we write $D = d/dx$. A symbol such as $D$,
which is meaningless unless applied to a function of $x$, and which is therefore
not a mathematical quantity in the usual sense, bears the name
"operator." In the present connection $D$ may be regarded as nothing
more than an abbreviation. Later, however, when the mathematics of
quantum mechanics is to be studied, it will be found that operators such as
$D$ are entities of considerable significance which give rise to an operator
algebra quite different in many respects from ordinary algebra. For the
present we merely observe that a differential equation of the type under
discussion in its most general form may be written:

$$D^ny + a_1D^{n-1}y + a_2D^{n-2}y + \cdots + a_ny = 0 \tag{2-12}$$

The $a$'s are constants; the order of the equation is $n$. Consider now the
differential equation

$$(D - r_1)(D - r_2)\cdots(D - r_n)y = 0 \tag{2-13}$$

which must be understood to mean that the successive application of
$d/dx - r_n$, $d/dx - r_{n-1}$, etc., upon $y$ is to yield zero, the $r$'s being constants.
It is clear that (12) and (13) become identical when the $r$'s are chosen to be
the roots of the algebraic equation

$$r^n + a_1r^{n-1} + a_2r^{n-2} + \cdots + a_n = 0 \tag{2-14}$$

Let us then attempt to solve (13). A particular solution of that equation is
easily found, for if $y$ satisfies

$$(D - r_n)y = 0$$

it will also satisfy (13), since further differentiations and multiplications by
$r$ will leave the right-hand side unchanged. But $(D - r_n)y = 0$ has the
solution $y = c_ne^{r_nx}$, hence this is a particular solution of (13).

Furthermore, we observe that the order of the "factors" $(D - r_i)$
appearing in (13) is insignificant. Hence any factor may be written last,
and this means that $c_{n-1}e^{r_{n-1}x}$ is also a particular solution, and so on. On
adding all particular solutions, i.e., on putting

$$y = \sum_{i=1}^{n} c_ie^{r_ix} \tag{2-15}$$

there results a solution with $n$ independent arbitrary constants, and this
must therefore be the complete solution. To summarize: in order to solve
(12), first determine the roots of (14), which is known as the auxiliary
equation. If these roots are denoted by $r_i$ the general solution is (15).
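For a second order equation the recipe reduces to solving a quadratic. The Python
sketch below (function names assumed, distinct roots assumed) finds the roots of the
auxiliary equation with complex arithmetic, so that real and complex roots are treated
alike, and builds the general solution (15) for $n = 2$.

```python
import cmath

def auxiliary_roots(a1, a2):
    """Roots r1, r2 of the auxiliary equation r^2 + a1 r + a2 = 0."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0

def general_solution(x, a1, a2, c1, c2):
    """y = c1 e^{r1 x} + c2 e^{r2 x}, eq. (2-15) with n = 2 (distinct roots assumed)."""
    r1, r2 = auxiliary_roots(a1, a2)
    return c1 * cmath.exp(r1 * x) + c2 * cmath.exp(r2 * x)

# y'' - 3 y' + 2 y = 0 has auxiliary roots 2 and 1.
print(auxiliary_roots(-3.0, 2.0))
# y'' + 4 y = 0 has the purely imaginary roots +2i and -2i.
print(auxiliary_roots(0.0, 4.0))
```

The returned values are complex numbers even when the roots are real; the imaginary
part is then zero.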

One point is to be noted. If the coefficients $a$ appearing in (12) are
functions of $x$, the decomposition into factors leading to (13) cannot be
made by solving the auxiliary equation. The reason is that then the $r$'s
will also be functions of $x$, and

$$(D - r_1)(D - r_2)y \neq (D - r_2)(D - r_1)y$$

as the reader may easily verify. This state of affairs is expressed succinctly
by saying that the operators $(D - r_1)$ and $(D - r_2)$ are commutative only
if the $r$'s are constants. For variable $r$'s the order of the factors in (13)
is also essential, so that the whole method of solution here discussed must
fail.

Returning to the case of constant coefficients, one minor difficulty must
be considered. Suppose that two roots of the auxiliary equation are equal.
If they are called $r$, the supposedly general solution will contain the part
$(c_1 + c_2)e^{rx}$ which is equivalent to $ce^{rx}$. One arbitrary constant has been
lost and the solution obtained is no longer complete. To remove this
fault we consider the two factors of (13) which gave rise to it and study the
equation

$$(D - r)^2y = 0 \tag{2-16}$$

One solution is certainly $y = e^{rx}$. Let us look for a general solution of the
form $y = e^{rx}f(x)$. On substitution of this into (16) there results the
following differential equation for $f(x)$:

$$\frac{d^2f}{dx^2} = 0$$

Hence $f = c_1x + c_2$, and the complete solution of (16) reads

$$y = (c_1x + c_2)e^{rx}$$

This shows that, when two roots of the auxiliary equation are equal and
have the value $r_1$, the part of the solution $(c_1 + c_2)e^{r_1x}$ occurring in (15)
must be replaced by $(c_1x + c_2)e^{r_1x}$. An extension of this argument leads
to the general result: If $r_i$ is a $g$-fold root of the auxiliary equation, the
complete solution of (12) is

$$y = c_1e^{r_1x} + c_2e^{r_2x} + \cdots + \left(c_i + c_{i+1}x + c_{i+2}x^2 + \cdots + c_{i+g-1}x^{g-1}\right)e^{r_ix} + \cdots$$
Examples.
a. Simple harmonic motion.
When the force on a particle of mass $m$ moving along the $y$-axis is equal to
$-ky$, Newton's second law of motion reads:

$$m\frac{d^2y}{dt^2} = -ky$$

Here $k$, the force per unit of displacement of the particle, is known as the
stiffness of the oscillator. If we denote the positive constant $k/m$ by $\omega^2$,
the equation becomes $d^2y/dt^2 + \omega^2y = 0$. The roots of the auxiliary
equation $r^2 + \omega^2 = 0$ are $r_1 = i\omega$, $r_2 = -i\omega$. Hence by (15)

$$y = c_1e^{i\omega t} + c_2e^{-i\omega t}$$

The constants $c_1$ and $c_2$ may of course be complex. This result may be
written in two other, but equivalent, forms. On expanding the exponentials
in sines and cosines we obtain

$$y = (c_1 + c_2)\cos\omega t + (c_1 - c_2)\,i\sin\omega t = C_1\cos\omega t + C_2\sin\omega t$$

This last result may also be stated as follows:

$$y = A\sin(\omega t + \delta) = A'\cos(\omega t + \delta')$$

where the new constants $A$, $\delta$, and $A'$, $\delta'$ are related to $C_1$ and $C_2$
by $A\sin\delta = C_1$, $A\cos\delta = C_2$; $A'\cos\delta' = C_1$, $-A'\sin\delta' = C_2$, or
conversely $A^2 = A'^2 = C_1^2 + C_2^2$, $\delta = \tan^{-1}C_1/C_2$, $\delta' = -\tan^{-1}C_2/C_1$.

b. Chain sliding over a smooth peg.
The chain (cf. Fig. 3) is sliding over the peg, the right end moving downward.
Let the displacement of this end from O, the point it would occupy in
equilibrium, be $y$. If the linear density of the chain is $\lambda$, and its total
length $l$, the mass to be accelerated is $l\lambda$. The resultant force is $2\lambda yg$.
Hence, from Newton's second law,

$$l\lambda\frac{d^2y}{dt^2} = 2\lambda gy, \quad\text{or}\quad \frac{d^2y}{dt^2} - \frac{2g}{l}\,y = 0$$

Fig. 2-3

The auxiliary equation has the roots $\pm\sqrt{2g/l}$, leading
to the general solution $y = c_1e^{\sqrt{2g/l}\,t} + c_2e^{-\sqrt{2g/l}\,t}$.

The constants may be fixed by supposing that, when $t = 0$, $y = y_0$ and
$dy/dt = 0$. Then $c_1 + c_2 = y_0$; $c_1 - c_2 = 0$; and

$$y = \frac{y_0}{2}\left(e^{\sqrt{2g/l}\,t} + e^{-\sqrt{2g/l}\,t}\right) = y_0\cosh\sqrt{2g/l}\;t$$


c. Damped simple harmonic motion.
When the motion of the oscillator considered in example (a) is damped,
there is present, besides the restoring force $-ky$, a damping force proportional
(at small velocities) to the velocity, $-l(dy/dt)$, the negative sign indicating that
the force retards the motion; $l$ is known as the damping constant. The
differential equation describing the motion is

$$\frac{d^2y}{dt^2} + 2b\frac{dy}{dt} + \omega^2y = 0 \tag{2-17}$$

if $b$ is written for the constant quantity $l/2m$. The auxiliary equation has
the roots $-b \pm \sqrt{b^2 - \omega^2}$ so that the general solution becomes

$$y = c_1e^{(-b+\sqrt{b^2-\omega^2})t} + c_2e^{(-b-\sqrt{b^2-\omega^2})t}$$

To adjust the constants in conformity with physical conditions we suppose
that, at $t = 0$, $y = y_0$ and $dy/dt = 0$. Then with the use of the abbreviation
$R = \sqrt{b^2 - \omega^2}$

$$y = \frac{y_0}{2}\,e^{-bt}\left[\left(1 + \frac{b}{R}\right)e^{Rt} + \left(1 - \frac{b}{R}\right)e^{-Rt}\right] \tag{2-18}$$

Several special cases are of interest in this connection.

(α) $b > \omega$. $R$ is then real, but smaller than $b$. Hence both terms of
(18) represent an exponential decrease. The motion is not oscillatory.

(β) $b = \omega$. Then $R = 0$, and $y = y_0e^{-bt}(1 + bt)$. The motion is not
oscillatory; it is said to be critically damped.

(γ) $b < \omega$. Then $R$ is imaginary and may be written $R = i\omega'$,
$\omega'^2 = \omega^2 - b^2$. Eq. (18) now reads

$$y = y_0e^{-bt}\left(\cos\omega't + \frac{b}{\omega'}\sin\omega't\right)$$

or, in equivalent form,

$$y = \frac{\omega}{\omega'}\,y_0e^{-bt}\sin(\omega't + \delta)$$

where $\delta = \tan^{-1}\omega'/b$. This represents a damped sinusoidal motion of
period $T = 2\pi/\sqrt{\omega^2 - b^2}$; the amplitude decreases exponentially as $e^{-bt}$.

d. Natural oscillations in an electrical circuit.
In a circuit containing $R$, $L$, and $C$, the sum of the "partial" electromotive
forces due to inductance, resistance and capacitance equals the external
e.m.f. If the latter is zero (natural oscillations) we have

$$L\frac{dI}{dt} + RI + \frac{q}{C} = 0$$

or, remembering that $I = dq/dt$,

$$\frac{d^2q}{dt^2} + \frac{R}{L}\frac{dq}{dt} + \frac{1}{LC}\,q = 0$$

This equation is of the form (17); the constants are $b = R/2L$,
$\omega = (LC)^{-1/2}$. The solutions are already given in the foregoing example.
In particular, if oscillations are to take place, $\omega > b$, i.e., $2\sqrt{L/C} > R$.
In that case

$$q = q_0\left(1 - \frac{R^2C}{4L}\right)^{-1/2}e^{-Rt/2L}\sin\left\{\left(\frac{1}{LC} - \frac{R^2}{4L^2}\right)^{1/2}t + \delta\right\}$$

$$\delta = \tan^{-1}\sqrt{\frac{4L}{R^2C} - 1}$$

The initial conditions here are that at $t = 0$, the condenser has a charge
$q_0$ and there is no current.

2.8. Linear Equations with Constant Coefficients; Right-Hand Mem- 
ber a Function of x.—We now restrict our considerations to differential 
equations of the second order. In terms of the notation of the foregoing 
section, the problem is to solve 


$$(D^2 + a_1D + a_2)y = f(x) \tag{2-19}$$

If the roots of the auxiliary equation are $r_1$ and $r_2$, this equation takes the
form

$$(D - r_1)(D - r_2)y = f(x) \tag{2-20}$$

Put $(D - r_2)y = u$, so that $(D - r_1)u = f(x)$. This is a linear first order
equation which can be solved by the method of sec. 3. It gives

$$u = e^{r_1x}\int e^{-r_1x}f(x)\,dx + c_1e^{r_1x} = e^{r_1x}\left(\varphi(x) + c_1\right)$$

if we define $\int_0^x e^{-r_1\xi}f(\xi)\,d\xi = \varphi(x)$. If this is substituted back into the
definition of $u$, the result is $(D - r_2)y = e^{r_1x}(\varphi(x) + c_1)$, an equation
which may again be treated in accordance with formula (6). Hence

$$y = e^{r_2x}\int e^{(r_1-r_2)x}\left(\varphi(x) + c_1\right)dx + c_2e^{r_2x}$$
$$= e^{r_2x}\int e^{(r_1-r_2)x}\varphi(x)\,dx + \frac{c_1}{r_1 - r_2}\,e^{r_1x} + c_2e^{r_2x}$$

On changing the meaning of the constant $c_1$, we write the solution of (19)

$$y = e^{r_2x}\int e^{(r_1-r_2)x}\varphi(x)\,dx + c_1e^{r_1x} + c_2e^{r_2x} \tag{2-21}$$

The form of this solution is interesting. The last two terms are identical
with the solution of the homogeneous equation. They are called the
complementary function, while the remainder, $e^{r_2x}\int e^{(r_1-r_2)x}\varphi(x)\,dx$, is
known as the particular integral. Thus the "inhomogeneity" of the
equation, $f(x)$, makes its appearance in the particular integral only. It is
sometimes possible to find the particular integral of an equation like (19)
by inspection, that is, by selecting any function which will satisfy the equation.
When this is available one can make use of the fact just noted and
form the complete solution by adding to this function the general solution
of the homogeneous equation. Usually, however, the straightforward
calculation of the particular integral is hardly more difficult.

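The particular integral can be written in a form which is often more
convenient in practice. On performing a partial integration we find

$$\int e^{(r_1-r_2)x}\varphi(x)\,dx = \frac{e^{(r_1-r_2)x}}{r_1 - r_2}\,\varphi(x) - \int\frac{e^{(r_1-r_2)x}}{r_1 - r_2}\,\frac{d\varphi}{dx}\,dx$$
$$= \frac{e^{(r_1-r_2)x}}{r_1 - r_2}\int e^{-r_1x}f(x)\,dx - \frac{1}{r_1 - r_2}\int e^{-r_2x}f(x)\,dx$$

because $d\varphi/dx = e^{-r_1x}f(x)$. The particular integral then becomes

$$\frac{1}{r_1 - r_2}\left\{e^{r_1x}\int e^{-r_1x}f(x)\,dx - e^{r_2x}\int e^{-r_2x}f(x)\,dx\right\}$$

and finally

$$y = \frac{1}{r_1 - r_2}\left[e^{r_1x}\int e^{-r_1x}f(x)\,dx - e^{r_2x}\int e^{-r_2x}f(x)\,dx\right] + c_1e^{r_1x} + c_2e^{r_2x} \tag{2-22}$$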
Examples.

a. Forced oscillations of a mechanical or electrical system.
The equation to be considered is (17) but with a function of $t$ instead of
zero on the right. In most applications this function, which represents the
impressed force divided by the mass of the oscillating system in the mechanical
case, is a sinusoidal function of the time. Hence we are dealing with
the differential equation

$$\frac{d^2y}{dt^2} + 2b\frac{dy}{dt} + \omega^2y = f_0\sin\alpha t \tag{2-23}$$

As in sec. 7, example (c), the auxiliary equation has the roots

$$r_1 = -b + \sqrt{b^2 - \omega^2}, \qquad r_2 = -b - \sqrt{b^2 - \omega^2}$$

If again we denote $\sqrt{b^2 - \omega^2}$ by $R$, the particular integral is

$$\text{P.I.} = \frac{1}{2R}\left\{e^{(-b+R)t}\int e^{(b-R)t}f_0\sin\alpha t\,dt - e^{(-b-R)t}\int e^{(b+R)t}f_0\sin\alpha t\,dt\right\}$$

The integrals here may be evaluated by means of the formulas on p. 48.
When this is done and the terms are suitably collected,

$$\text{P.I.} = \frac{f_0}{2R}\left[\frac{(b - R)\sin\alpha t - \alpha\cos\alpha t}{(b - R)^2 + \alpha^2} - \frac{(b + R)\sin\alpha t - \alpha\cos\alpha t}{(b + R)^2 + \alpha^2}\right]$$
$$= \frac{f_0}{(\omega^2 - \alpha^2)^2 + 4\alpha^2b^2}\left\{(\omega^2 - \alpha^2)\sin\alpha t - 2b\alpha\cos\alpha t\right\}$$

To obtain the complete solution we must add to this the solution of (17).
Hence

$$y = \frac{f_0}{(\omega^2 - \alpha^2)^2 + 4\alpha^2b^2}\left\{(\omega^2 - \alpha^2)\sin\alpha t - 2b\alpha\cos\alpha t\right\} + e^{-bt}\left(c_1e^{Rt} + c_2e^{-Rt}\right) \tag{2-24}$$

It is seen that the complementary function decays exponentially with $t$
and will be damped out eventually. It is therefore of little interest in
physical applications. The amplitude of the oscillations,

$$\frac{f_0}{\left[(\omega^2 - \alpha^2)^2 + 4\alpha^2b^2\right]^{1/2}}$$

has a maximum when the impressed (angular) frequency has the value

$$\alpha = (\omega^2 - 2b^2)^{1/2}$$

This is said to be the condition of resonance between the impressed force
and the vibrating system. If $b$ is zero there occurs what is sometimes
referred to as the "resonance catastrophe," for in that case the amplitude
is infinite when $\alpha = \omega$.
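The resonance condition can be examined numerically. The Python sketch below is an
illustration under assumed names and sample values; it evaluates the steady-state part
of (24), its amplitude, and the impressed frequency at which the amplitude is largest.

```python
import math

def steady_state(t, f0, omega, b, alpha):
    """Particular integral of (2-23): the forced motion after transients die out."""
    denom = (omega**2 - alpha**2) ** 2 + 4.0 * alpha**2 * b**2
    return f0 / denom * ((omega**2 - alpha**2) * math.sin(alpha * t)
                         - 2.0 * b * alpha * math.cos(alpha * t))

def amplitude(f0, omega, b, alpha):
    """Amplitude f0 / [(omega^2 - alpha^2)^2 + 4 alpha^2 b^2]^(1/2)."""
    return f0 / math.sqrt((omega**2 - alpha**2) ** 2 + 4.0 * alpha**2 * b**2)

def resonance_frequency(omega, b):
    """Impressed angular frequency of maximum amplitude (needs omega^2 > 2 b^2)."""
    return math.sqrt(omega**2 - 2.0 * b**2)

# Hypothetical oscillator: omega = 2, b = 0.3, unit driving strength.
f0, omega, b = 1.0, 2.0, 0.3
alpha_res = resonance_frequency(omega, b)
print(alpha_res, amplitude(f0, omega, b, alpha_res))
```

With damping present the resonance frequency lies slightly below $\omega$, and the
amplitude there is finite; it grows without limit only as $b$ tends to zero.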
(a) Mechanical system. 

The present theory can be applied, for instance, to a mass m held in equilib- 
rium by a spring of stiffness k and damping constant J. We then have, as 
in sec. 7c, 


Resonance occurs when 
k p \ 1/2 
= (= T za) 
(8) Electrical system. 


For an electrical system with an impressed electromotive force $E_0\sin\alpha t$ we
have (cf. sec. 7d), $b = R/2L$, $\omega = (LC)^{-1/2}$, $f_0 = E_0/L$. Resonance
occurs when

$$\alpha = \left(\frac{1}{LC} - \frac{R^2}{2L^2}\right)^{1/2}$$

The solution (24) represents the charge, $q$, residing on the condenser at any
instant. The current $I$ is obtained by differentiating $q$ with respect to the
time. Both terms in braces then become positive, and

$$I = A[(\omega^2-\alpha^2)\cos\alpha t + 2b\alpha\sin\alpha t]$$

where $A$ stands for $E_0\alpha/\{L[(\omega^2-\alpha^2)^2+4\alpha^2b^2]\}$. The power expended
in the circuit is $\int_0^T EI\,dt$. This integral contains two terms, one with the
integrand $\sin\alpha t\cos\alpha t$, the other with the integrand $\sin^2\alpha t$. The first of
these is 0 provided $T$ is taken large enough to include a great number of
cycles $2\pi/\alpha$, the last gives $\int_0^T\sin^2\alpha t\,dt = T/2$. Hence the power expended
is

$$E_0Ab\alpha T$$

The part of the current proportional to $\cos\alpha t$ causes no power consumption;
it is a “wattless” current which is always out of phase with the impressed
electromotive force.
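The statement that only the $\sin\alpha t$ part of the current contributes can be checked numerically. The sketch below integrates $EI$ over a whole number of cycles with the trapezoidal rule, taking $E = E_0\sin\alpha t$, and compares the result with $E_0Ab\alpha T$. The function name, the default step counts and the sample values are assumptions made for illustration.

```python
import math

def energy_expended(E0, L, b, omega, alpha, cycles=50, steps_per_cycle=400):
    """Return (numerical integral of E*I dt over whole cycles, E0*A*b*alpha*T)."""
    D = (omega**2 - alpha**2) ** 2 + 4.0 * alpha**2 * b**2
    A = E0 * alpha / (L * D)
    T = cycles * 2.0 * math.pi / alpha
    n = cycles * steps_per_cycle
    h = T / n

    def integrand(t):
        emf = E0 * math.sin(alpha * t)
        current = A * ((omega**2 - alpha**2) * math.cos(alpha * t)
                       + 2.0 * b * alpha * math.sin(alpha * t))
        return emf * current

    # Composite trapezoidal rule on [0, T].
    total = 0.5 * (integrand(0.0) + integrand(T))
    for k in range(1, n):
        total += integrand(k * h)
    return total * h, E0 * A * b * alpha * T
```

The two returned numbers should agree closely, since the $\sin\alpha t\cos\alpha t$ term integrates to zero over complete cycles.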


b. Electrical polarization. 

An equation like (23) also describes the response of ordinary matter to an
impinging electromagnetic wave. A light wave, for instance, which is
polarized in such a way that its electric vector is along $y$, when incident
upon an electron inside a refracting medium, will exert a force equal to
$eE_0\sin\alpha t$ upon this electron. Here $E_0$ is the amplitude of the electric
vector of the light wave, $e$ the charge on an electron, $\alpha$ the frequency of the
light (assumed monochromatic). $f_0$ in (23) is then $(e/m)E_0$, $m$ being the
electron mass. The solution is given by (24). $y$ represents the displacement
of the electron under consideration at the time $t$. This gives rise to a
dipole of moment $ey$. By “polarization” is meant the dipole moment per
unit volume of the material, and this is obtained on multiplying the dipole
moment due to one electron by the number of displaceable electrons per
unit volume. If this number is $N$, then the polarization

$$P = \frac{Ne^2E_0}{m}\,\frac{(\omega^2-\alpha^2)\sin\alpha t - 2b\alpha\cos\alpha t}{(\omega^2-\alpha^2)^2+4\alpha^2b^2}$$

Further considerations of a physical nature show how the index of refraction
and the conductivity of the substance may be deduced very easily
from this expression for $P$.⁵

⁵ See, for instance, Page, L., “Introduction to Theoretical Physics,” Third Edition,
D. Van Nostrand Co., 1952, p. 582 et seq.


2.9. Other Special Forms of Second Order Differential Equations.— 
a. An equation of the type

$$\frac{d^2y}{dx^2} = f(x) \tag{2-25}$$

can be integrated by the method of sec. 8. If this is done, only formula
(21) is applicable, for the second formula (22) involves the quantity
$r_1 - r_2$, which is zero, the auxiliary equation corresponding to (25) having
equal roots: $r_1 = r_2 = 0$. The solution is

$$y = \int\!\!\int f(x)\,dx\,dx + c_1 + c_2x = \int\left[\int f(x)\,dx\right]dx + c_1 + c_2x$$

This procedure is here very artificial, of course, for this result could have
been obtained directly by integrating (25) twice.

[Fig. 2-4]


Example. Suspension bridge.
Consider the part of the cable between $A$ and the variable point $P$. It is in
equilibrium under the action of three forces: the horizontal force, $H$, the
tension, $T$, at $P$, and the weight $W$ of, or supported by, $AP$, which of course
need not act at the middle of the segment. Hence we have

$$T\sin\theta = W; \qquad T\cos\theta = H; \qquad \tan\theta = \frac{dy}{dx} = \frac{W}{H}$$

This relation is true for every point $P$, provided $W$ is the load between $A$
and $P$. It is generally more convenient to write the equation in terms of
$w = dW/dx$, i.e., the load per unit horizontal distance; $w = w(x)$:

$$\frac{d^2y}{dx^2} = \frac{w(x)}{H} \tag{2-26}$$

where $H$ is, of course, a constant.
In the case of the suspension bridge, the load is uniform along $x$, hence
$w = \text{const.}$
Solution:

$$y = \frac{w}{2H}x^2 + c_1x + c_2, \quad \text{a parabola}$$


b. Equations not containing y.
If the equation to be solved is

$$\frac{d^2y}{dx^2} = f\left(x, \frac{dy}{dx}\right)$$

introduce the new variable $p = dy/dx$. The resulting equation

$$\frac{dp}{dx} = f(x, p)$$

can then be solved by one of the methods already discussed.


Example. Cable hanging under its own weight. 


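The equation describing the cable is (26), but $w$ is not constant. In this
case it is $dW/ds$, the weight per unit length of cable, which is constant,
provided the latter is uniform. Put $dW/ds = \lambda$. Then, since
$ds/dx = \sqrt{1+p^2}$ with $p = dy/dx$,

$$\frac{dp}{dx} = \frac{\lambda}{H}\sqrt{1+p^2}$$

From this $dp/\sqrt{1+p^2} = (\lambda/H)\,dx$, so that

$$\sinh^{-1}p = \frac{\lambda}{H}x + c_1$$

If the origin is chosen at the lowest point of the cable, $c_1 = 0$, and

$$\frac{dy}{dx} = \sinh\frac{\lambda}{H}x; \qquad y = \frac{H}{\lambda}\cosh\frac{\lambda x}{H} + c_2 = \frac{H}{\lambda}\left(\cosh\frac{\lambda x}{H} - 1\right)$$

This curve is known as a catenary.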
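The catenary can be tested against the differential equation from which it came. The sketch below evaluates $d^2y/dx^2 - (\lambda/H)\sqrt{1 + (dy/dx)^2}$ with central differences; the function names, the step size `h` and the sample values of $\lambda$ and $H$ are illustrative assumptions.

```python
import math

def catenary(x, lam, H):
    """y = (H/lam) * (cosh(lam*x/H) - 1), origin at the lowest point of the cable."""
    return (H / lam) * (math.cosh(lam * x / H) - 1.0)

def residual(x, lam, H, h=1e-4):
    """d2y/dx2 - (lam/H)*sqrt(1 + (dy/dx)^2), estimated with central differences."""
    y_minus = catenary(x - h, lam, H)
    y_mid = catenary(x, lam, H)
    y_plus = catenary(x + h, lam, H)
    slope = (y_plus - y_minus) / (2.0 * h)
    curvature = (y_plus - 2.0 * y_mid + y_minus) / h**2
    return curvature - (lam / H) * math.sqrt(1.0 + slope**2)
```

The residual should be small compared with $\lambda/H$ at any $x$; it is limited only by the finite-difference error.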


c. Equations not containing x.

$$\frac{d^2y}{dx^2} = f\left(y, \frac{dy}{dx}\right)$$

Again we put $dy/dx = p$, but now we write

$$\frac{d^2y}{dx^2} = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy}$$

The resulting equation

$$p\frac{dp}{dy} = f(y, p)$$

is solved for $p$, then integrated once more.

All linear homogeneous equations of the second order with constant
coefficients discussed in sec. 2.7 can be solved by this method, but the
treatment of sec. 2.7 is usually simpler.


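Example. Anharmonic oscillator.
Differential equation:

$$\frac{d^2y}{dt^2} + \omega^2y + \lambda y^2 = 0$$

Solution:

$$p\,dp = -(\omega^2y + \lambda y^2)\,dy, \qquad p = \frac{dy}{dt} = \left(c - \omega^2y^2 - \tfrac{2}{3}\lambda y^3\right)^{1/2}$$

The integration of this equation leads to an elliptic function.⁶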
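The first integration above says that $p^2 + \omega^2y^2 + \tfrac{2}{3}\lambda y^3$ keeps the constant value $c$ along any solution. The sketch below follows a solution of $y'' + \omega^2y + \lambda y^2 = 0$ numerically and monitors this quantity. The classical Runge-Kutta scheme, the step size and the initial values are our own choices for illustration; they are not the method of the text.

```python
def first_integral(y, p, omega, lam):
    """c = p^2 + omega^2 y^2 + (2/3) lam y^3, constant along any exact solution."""
    return p * p + omega**2 * y * y + (2.0 / 3.0) * lam * y**3

def rk4_trajectory(y0, p0, omega, lam, dt=1e-3, steps=5000):
    """Integrate y' = p, p' = -(omega^2 y + lam y^2); return the list of (y, p)."""
    def accel(y):
        return -(omega**2 * y + lam * y * y)

    y, p = y0, p0
    points = [(y, p)]
    for _ in range(steps):
        k1y, k1p = p, accel(y)
        k2y, k2p = p + 0.5 * dt * k1p, accel(y + 0.5 * dt * k1y)
        k3y, k3p = p + 0.5 * dt * k2p, accel(y + 0.5 * dt * k2y)
        k4y, k4p = p + dt * k3p, accel(y + dt * k3y)
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        points.append((y, p))
    return points
```

Evaluating `first_integral` along the returned points gives a value that drifts only by the integration error, which is a convenient test of any numerical solution of this equation.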


Problem. Solve the equation for the anharmonic oscillator by successive
approximation, assuming that $\lambda y \ll \omega^2$.
Ans.

$$y = a\cos(\omega t + \epsilon) - \frac{\lambda a^2}{2\omega^2}\left[1 - \tfrac{1}{3}\cos 2(\omega t + \epsilon)\right]$$


INTEGRATION IN SERIES 


A type of differential equation occurring very commonly in physics has
the form

$$y'' + X_1y' + X_2y = 0 \tag{2-27}$$

where $X_1$ and $X_2$ are functions of $x$, the independent variable. Here and
in the following, primes denote differentiations with respect to $x$. The
methods developed in the preceding sections of this chapter are suitable for
solving (27) when $X_1$ and $X_2$ have special forms, but are far from yielding
solutions of that equation in general. In fact, such solutions are frequently
not available in closed or finite form. For certain regions of $x$, however,
they may be found in the form of convergent series by a procedure to be
studied presently.


⁶ See Peirce, B. O., “Short Table of Integrals,” Third Revised Edition, Ginn and
Co., New York, 1929. Introductory treatments of elliptic integrals may be found in
“Higher Mathematics,” by R. S. Burington and C. C. Torrance, McGraw-Hill Book
Co., New York, 1939, and “Higher Mathematics for Engineers and Physicists,” by I. S. and
E. S. Sokolnikoff, McGraw-Hill Book Co., New York, Second Edition, 1941.

2.10. Qualitative Considerations Regarding Eq. 27.—Before turning to
the consideration of exact solutions of (27), a few remarks concerning their
qualitative behavior in limited domains of $x$ may be of value. To survey
their behavior, it is often advisable to remove the first derivative occurring
in (27), which is always possible by means of a simple transformation of the
dependent variable. Instead of $y$, we introduce $v$, related to $y$ by

$$y = ve^{-\frac{1}{2}\int X_1\,dx}$$

When this is substituted into (27) and the exponential factor is then
cancelled, there results an equation for $v$:

$$v'' + \left(X_2 - \tfrac{1}{2}X_1' - \tfrac{1}{4}X_1^2\right)v = 0$$

from which the first derivative is absent. This represents essentially a
relation between $v$ and the curvature of $v$ and may be written

$$v'' = f(x)v$$


[Fig. 2-5. Curves $v(x)$ and $f(x)$.]


One fact is at once apparent: provided $v$ is finite, it has a point of
inflexion wherever $f(x) = 0$. Furthermore, in regions where $f(x) > 0$
two facts are to be noted: If $v$ is positive and has a positive slope, the slope
will continually increase as $x$ increases, causing $v$ to grow rapidly; if $v$ is
positive and has a negative slope, the positive $v''$ will continually diminish
its steepness, causing $v$ to approach the $x$-axis and then in general to turn
upwards again. For negative $v$ the words “positive” and “negative”
in the preceding sentence should be interchanged. This qualitative
behavior is most easily remembered if we think of the special case in which
$f(x) = \text{const.} = \omega^2 > 0$. The solution is then

$$v = c_1e^{\omega x} + c_2e^{-\omega x}$$

which typifies the foregoing remarks.
If, however, we consider a region in which $f(x) < 0$, the slope of positive
$v$ will be continually diminished. Thus if $v$ starts out with positive slope
this will soon be zero and then decrease until $v = 0$; as $v$ then becomes
negative its negative slope will increase until it is horizontal and $v$ turns
back toward zero. In short, $v$ is oscillatory. This again is easily remembered
if we consider the special case in which $f(x) = -\omega^2 < 0$ for it has the
solution $v = c\sin(\omega x + \delta)$.

Fig. 5 illustrates these facts. To the left of $A$, $v$ oscillates; at $A$ it has a
point of inflexion; to the right of $A$ it is of exponential behavior.
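The two kinds of behavior are easily exhibited numerically. The sketch below integrates $v'' = f(x)v$ for a user-supplied $f$ and counts the sign changes of $v$; with $f = -\omega^2$ the solution should oscillate, with $f = +\omega^2$ it should not. The Runge-Kutta integrator, the function names and the sample values are assumptions introduced here for illustration.

```python
def integrate(f, v0, slope0, x0, x1, steps=20000):
    """Integrate v'' = f(x) v from x0 to x1 with the classical Runge-Kutta scheme.

    Returns the list of sampled values of v, including the starting value."""
    h = (x1 - x0) / steps
    x, v, s = x0, v0, slope0
    values = [v]
    for _ in range(steps):
        k1v, k1s = s, f(x) * v
        k2v, k2s = s + 0.5 * h * k1s, f(x + 0.5 * h) * (v + 0.5 * h * k1v)
        k3v, k3s = s + 0.5 * h * k2s, f(x + 0.5 * h) * (v + 0.5 * h * k2v)
        k4v, k4s = s + h * k3s, f(x + h) * (v + h * k3v)
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        s += h * (k1s + 2.0 * k2s + 2.0 * k3s + k4s) / 6.0
        x += h
        values.append(v)
    return values

def sign_changes(values):
    """Number of sign changes between consecutive samples (a proxy for zeros of v)."""
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0.0)
```

With `f = lambda x: -4.0`, `v0 = 1`, `slope0 = 0` the solution is $\cos 2x$, which has six zeros between 0 and 10; with `f = lambda x: 4.0` it is $\cosh 2x$, which has none.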

2.11. Example of Integration in Series. Legendre’s Equation.— 
To illustrate the method of series integration, let us postpone fundamental
matters and start by studying a specific example. An equation of considerable
interest is Legendre’s; it has the form

$$(1 - x^2)y'' - 2xy' + l(l+1)y = 0 \tag{2-28}$$

in which $l$ is a constant. We attempt to find a solution which is a series in
positive powers of $x$. If the lowest power occurring is $x^\kappa$, this solution will
have the general form

$$y = \sum_\lambda a_\lambda x^{\kappa+\lambda} \tag{2-29}$$

Solving the differential equation then amounts to determining the coefficients
$a_\lambda$. Whether the series converges can be tested after this has been
achieved. At present it will be assumed that this is the case, and that (29)
may be differentiated term by term. When (29) is substituted in (28)
the result is

$$\sum_\lambda a_\lambda(\kappa+\lambda)(\kappa+\lambda-1)x^{\kappa+\lambda-2} - \sum_\lambda a_\lambda[(\kappa+\lambda)(\kappa+\lambda-1) + 2(\kappa+\lambda) - l(l+1)]x^{\kappa+\lambda} = 0 \tag{2-30}$$

This equation must hold for every value of $x$, and this can be true only if
the coefficient of every power of $x$ is identically zero. Since $\lambda$ cannot, by
hypothesis, be negative, the lowest power of $x$ occurring in (30) is $x^{\kappa-2}$,
and it is present only in the first summation of (30). Thus we find, putting
$\lambda = 0$ to obtain the term in question,

$$a_0\kappa(\kappa - 1) = 0 \tag{2-31}$$

$a_0$ is the lowest coefficient in our summation and hence not zero. Equation
(31) therefore determines $\kappa$. It is often called the indicial equation.
Clearly, two values of $\kappa$ are permissible:

$$\kappa = 0, 1$$
Next, we see what further information eq. (30) will give. According to
the foregoing, the coefficient of $x^{\kappa+j}$ must vanish for every positive integer $j$.
Now the term corresponding to the $(\kappa + j)$-th power of $x$ is obtained in the
first summation by putting $\lambda = j + 2$, in the second by putting $\lambda = j$.
Hence

$$a_{j+2}(\kappa+j+2)(\kappa+j+1) = a_j[(\kappa+j)(\kappa+j+1) - l(l+1)]$$

or

$$a_{j+2} = \frac{(\kappa+j)(\kappa+j+1) - l(l+1)}{(\kappa+j+1)(\kappa+j+2)}\,a_j \tag{2-32}$$

Thus, if $a_j$ is given, $a_{j+2}$ can be computed from this relation. Starting
with $a_0$, (32) permits us to obtain, successively, $a_2$, $a_4$, etc.; $a_0$, however, is
arbitrary; it is one of the two arbitrary constants appearing in the general
solution of a second order differential equation. On the other hand, if $a_1$
is assigned arbitrarily, all coefficients with odd subscripts are deducible from
(32).
Choice 1. Let us take $\kappa = 0$. Eq. (32) then reads

$$a_{j+2} = \frac{j(j+1) - l(l+1)}{(j+1)(j+2)}\,a_j \tag{2-33}$$

On taking $a_0$ and $a_1$ as arbitrary constants, the solution becomes

$$y = \left(1 - \frac{l(l+1)}{2!}x^2 - \frac{[6-l(l+1)]\,l(l+1)}{4!}x^4 - \cdots\right)a_0 + \left(x + \frac{2-l(l+1)}{3!}x^3 + \frac{[2-l(l+1)][12-l(l+1)]}{5!}x^5 + \cdots\right)a_1$$

$$= \left(1 - \frac{l(l+1)}{2!}x^2 + \frac{(l-2)l(l+1)(l+3)}{4!}x^4 - \cdots + (-1)^r\frac{l(l-2)\cdots(l-2r+2)(l+1)(l+3)\cdots(l+2r-1)}{(2r)!}x^{2r} + \cdots\right)a_0$$

$$+ \left(x - \frac{(l-1)(l+2)}{3!}x^3 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^5 - \cdots + (-1)^r\frac{(l-1)(l-3)\cdots(l-2r+1)(l+2)(l+4)\cdots(l+2r)}{(2r+1)!}x^{2r+1} + \cdots\right)a_1 \tag{2-34'}$$

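Choice 2. Let us take $\kappa = 1$. Eq. (32) then reads

$$a_{j+2} = \frac{(j+1)(j+2) - l(l+1)}{(j+2)(j+3)}\,a_j$$

If now we take again $a_0$ and $a_1$ as arbitrary constants, we find

$$y = \left(x + \frac{2-l(l+1)}{3!}x^3 + \frac{[2-l(l+1)][12-l(l+1)]}{5!}x^5 + \cdots\right)a_0 + \left(x^2 + \frac{6-l(l+1)}{3\cdot4}x^4 + \frac{[6-l(l+1)][20-l(l+1)]}{3\cdot4\cdot5\cdot6}x^6 + \cdots\right)a_1$$

$$= \left(x - \frac{(l-1)(l+2)}{3!}x^3 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^5 - \cdots\right)a_0 + \left(x^2 - \frac{(l-2)(l+3)}{3\cdot4}x^4 + \frac{(l-2)(l-4)(l+3)(l+5)}{3\cdot4\cdot5\cdot6}x^6 - \cdots\right)a_1 \tag{2-35'}$$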

The terms multiplying $a_0$ in (35’) are seen to be identical with those
multiplying $a_1$ in (34’); hence these two particular solutions are the same. The
second part of (35’), however, does not agree with the first of (34’), both
of which represent series in even powers of $x$. It might seem, therefore, as
if we had obtained altogether three independent solutions, which is, of
course, impossible. But closer inspection would show that the second part
of (35’) is not a solution at all. This is seen at once if, after assuming any
specific value for $l$, we substitute it back into the differential equation.
The trouble is that, putting $\kappa = 1$ and $a_0 = 0$, we have carelessly discarded
any constant term which might appear in the sequence. The present
example indicates clearly that the solution of a differential equation is not
an altogether mechanical matter and that caution must be used at every
step. Summarizing, we observe that the significant parts of (34’) and
(35’) are:

$$y_1 = \left[1 - \frac{l(l+1)}{2!}x^2 + \frac{(l-2)l(l+1)(l+3)}{4!}x^4 - \cdots + (-1)^r\frac{l(l-2)\cdots(l-2r+2)(l+1)(l+3)\cdots(l+2r-1)}{(2r)!}x^{2r} + \cdots\right]a_0 \tag{2-34}$$

$$y_2 = \left[x - \frac{(l-1)(l+2)}{3!}x^3 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^5 - \cdots + (-1)^r\frac{(l-1)(l-3)\cdots(l-2r+1)(l+2)(l+4)\cdots(l+2r)}{(2r+1)!}x^{2r+1} + \cdots\right]a_1 \tag{2-35}$$


Problem. Show that the equation $y'' + y = 0$, if integrated in series, has two
particular solutions, one of which may be identified with the cosine series, the other with
the sine series.
One further point should be observed. When any one term in (34) is
zero, all succeeding terms vanish also and the series becomes a polynomial.
The conditions under which infinite series like (34) reduce to polynomials
are of great importance in many physical problems and will be discussed
more fully later.
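The way in which the series breaks off can be seen by running the recurrence for $\kappa = 0$, namely $a_{j+2} = \{[j(j+1) - l(l+1)]/[(j+1)(j+2)]\}a_j$, in exact arithmetic. The sketch below is illustrative only; the function name and its arguments are our own.

```python
from fractions import Fraction

def series_coefficients(l, start, terms):
    """Coefficients generated by a_{j+2} = [j(j+1) - l(l+1)] / [(j+1)(j+2)] * a_j.

    start = 0 gives the even series with a_0 = 1,
    start = 1 gives the odd series with a_1 = 1.
    Returns a dict mapping the power of x to its coefficient."""
    coeffs = {start: Fraction(1)}
    j = start
    for _ in range(terms - 1):
        ratio = Fraction(j * (j + 1) - l * (l + 1), (j + 1) * (j + 2))
        coeffs[j + 2] = ratio * coeffs[j]
        j += 2
    return coeffs
```

For `l = 2` the even series is `1 - 3x^2` followed by zeros, and for `l = 3` the odd series is `x - (5/3)x^3` followed by zeros; these are constant multiples of the familiar polynomials $(3x^2-1)/2$ and $(5x^3-3x)/2$.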

The work thus far has only established the fact that the series (34) and
(35) are formal solutions of Legendre’s equation, that is, they would
satisfy (28) if substituted in it. Whether the solutions are of any interest
depends on their convergence properties. A series converges if the ratio of
the absolute values of two successive terms,

$$\frac{|u_{j+2}|}{|u_j|}$$

is smaller than unity for large $j$. Now this ratio is clearly

$$\left|\frac{a_{j+2}}{a_j}\right|x^2$$

But

$$\left|\frac{a_{j+2}}{a_j}\right|$$

is immediately obtainable from (33). As $j \to \infty$ it becomes 1. Hence
the condition that (34) and (35) converge is that $x^2 < 1$, and this is true as
long as $|x| < 1$. For values of $x$ in the range $-1 < x < 1$ our solution is
a significant one; for other values it fails. Is it possible to construct a
solution valid for $x^2 > 1$? This is indeed not difficult.

Let us suppose that $y$, instead of being given by (29), has the form
$y = \sum_\lambda a_\lambda x^{\kappa-\lambda}$. Eq. (30) will then read

$$\sum_\lambda a_\lambda(\kappa-\lambda)(\kappa-\lambda-1)x^{\kappa-\lambda-2} - \sum_\lambda a_\lambda[(\kappa-\lambda)(\kappa-\lambda+1) - l(l+1)]x^{\kappa-\lambda} = 0$$

$\kappa$ now denotes the highest power occurring in the series. The indicial
equation is obtained by putting the coefficient of the highest power of $x$ equal to
zero. Thus

$$\kappa(\kappa+1) - l(l+1) = 0$$

whence

$$\kappa = l \quad \text{or} \quad -l-1$$

As before, the coefficient of $x^{\kappa-j}$ must vanish for every positive integer $j$.
This implies

$$a_{j-2}(\kappa-j+2)(\kappa-j+1) = a_j[(\kappa-j)(\kappa-j+1) - l(l+1)]$$

or, on replacing $j$ by $j + 2$,

$$a_{j+2} = \frac{(\kappa-j)(\kappa-j-1)}{(\kappa-j-2)(\kappa-j-1) - l(l+1)}\,a_j \tag{2-36}$$

Choice 1. Let us take $\kappa = l$. Eq. (36) then reads

$$a_{j+2} = -\frac{(l-j)(l-j-1)}{(j+2)(2l-j-1)}\,a_j$$

If $a_0$ is chosen arbitrarily, the series becomes

$$y = a_0x^l\left(1 - \frac{l(l-1)}{2(2l-1)}x^{-2} + \frac{l(l-1)(l-2)(l-3)}{2\cdot4(2l-1)(2l-3)}x^{-4} - \cdots + (-1)^r\frac{l(l-1)\cdots(l-2r+1)}{2\cdot4\cdots2r\,(2l-1)(2l-3)\cdots(2l-2r+1)}x^{-2r} + \cdots\right) \tag{2-37}$$

The series in $a_1$ obtained by putting $a_0 = 0$ is of no interest since it
violates the assumption, previously made, that $\kappa$, i.e., $l$, represents the
highest power of the sequence. We shall therefore omit it at once.
Choice 2. Let us take $\kappa = -l - 1$. Then

$$a_{j+2} = \frac{(l+j+1)(l+j+2)}{(j+2)(2l+j+3)}\,a_j$$

If again we put $a_1 = 0$, there results the particular solution

$$y = a_0x^{-l-1}\left(1 + \frac{(l+1)(l+2)}{2(2l+3)}x^{-2} + \frac{(l+1)(l+2)(l+3)(l+4)}{2\cdot4(2l+3)(2l+5)}x^{-4} + \cdots + \frac{(l+1)(l+2)\cdots(l+2r)}{2\cdot4\cdots2r\,(2l+3)(2l+5)\cdots(2l+2r+1)}x^{-2r} + \cdots\right) \tag{2-38}$$

The two solutions (37) and (38) are independent, hence their sum represents
the general solution of Legendre’s equation. It is easily seen to converge
if $|x| > 1$, unless $l$ has such a value that the denominator of one of
the coefficients in the series vanishes. This case will be studied shortly.
We are now in possession of two forms of solution of eq. (28). The first
(eqs. 34 and 35) converges when $|x| < 1$, the second (eqs. 37 and 38)
when $|x| > 1$. Under special circumstances, however, (34) or (35) as
well as (37) or (38) may become polynomials, which remain finite for every
finite value of $x$. It is interesting to see what happens to the various
particular solutions when this contingency arises.

Eq. (34) reduces to a polynomial when $l$ is an even positive or an odd
negative integer (or zero).

a. Let $l$ be even and positive; $l = 2k$. (34) then becomes

$$y = a_0\left(1 - \frac{l(l+1)}{2!}x^2 + \cdots + (-1)^{l/2}\frac{l(l-2)\cdots2\,(l+1)(l+3)\cdots(2l-1)}{l!}x^l\right)$$

On the other hand, (37) becomes under these conditions

$$y = a_0\left(x^l - \frac{l(l-1)}{2(2l-1)}x^{l-2} + \cdots + (-1)^{l/2}\frac{l!}{l(l-2)\cdots2\,(l+1)(l+3)\cdots(2l-1)}\right)$$

These two solutions become identical if the second is multiplied by the
constant factor

$$(-1)^{l/2}\frac{l(l-2)\cdots2\,(l+1)(l+3)\cdots(2l-1)}{l!}$$

Hence the particular solution (34) coalesces with (37).

b. Let $l$ be odd and negative. Inspection shows that (34) now becomes
identical with (38).

Eq. (35) reduces to a polynomial when $l$ is an odd positive or an even
negative integer.

c. If $l$ is odd and positive, (35) reads

$$y = a_1\left(x - \frac{(l-1)(l+2)}{3!}x^3 + \cdots + (-1)^{(l-1)/2}\frac{(l-1)(l-3)\cdots2\,(l+2)(l+4)\cdots(2l-1)}{l!}x^l\right)$$

while (37) becomes

$$y = a_0\left(x^l - \frac{l(l-1)}{2(2l-1)}x^{l-2} + \cdots + (-1)^{(l-1)/2}\frac{l!}{2\cdot4\cdots(l-1)\,(l+2)(l+4)\cdots(2l-1)}x\right)$$

These two expressions become identical when the second is multiplied by
the coefficient of the last term in the parenthesis of the first.

d. If $l$ is an even and negative integer (35) turns into (38).

Having established these important relations between solutions (34) to
(38) we now return to the consideration of (37) and (38). Solutions (37)
and (38) for integral values of $l$ are of great importance in mathematical
physics. If the constant $a_0$ in (37) is chosen to be

$$\frac{(2l)!}{2^l(l!)^2} = \frac{(2l-1)(2l-3)\cdots1}{l!}$$

the resulting polynomial of degree $l$ is called a Legendre polynomial (or
Legendre coefficient or “zonal harmonic”). It is usually denoted by $P_l(x)$.
For purposes of reference we write it down again:

$$P_l(x) = \frac{1\cdot3\cdot5\cdots(2l-1)}{l!}\left(x^l - \frac{l(l-1)}{2(2l-1)}x^{l-2} + \frac{l(l-1)(l-2)(l-3)}{2\cdot4(2l-1)(2l-3)}x^{l-4} - \cdots\right) \tag{2-39}$$

The series here is to be continued down to the constant term (or, for odd
$l$, to the term in $x$). On the
other hand, (38) with the constant $a_0$ chosen to be $2^l(l!)^2/(2l+1)!$,
$l$ being a positive integer, is often denoted by $Q_l$. It is an infinite series:

$$Q_l(x) = \frac{2^l(l!)^2}{(2l+1)!}\left(x^{-l-1} + \frac{(l+1)(l+2)}{2(2l+3)}x^{-l-3} + \cdots + \frac{(l+1)(l+2)\cdots(l+2r)}{2\cdot4\cdots2r\,(2l+3)(2l+5)\cdots(2l+2r+1)}x^{-l-2r-1} + \cdots\right) \tag{2-40}$$

The following facts will be noted: 

When $l$ is a positive integer, (37) is a polynomial, but (38) is an infinite
series. The general solution of (28) is a linear combination of (37)
and (38).

When $l$ is a negative integer, (37) is an infinite series, and (38) is a
polynomial. The general solution of (28) is a linear combination of (37)
and (38).

When $2l$ is equal to some positive odd integer, solution (37) degenerates
into (38). To see this, suppose $2l = 2n - 1$. There will then appear a
vanishing denominator in the coefficient of $x^{l-2n}$ and in every subsequent
term of (37). To remove these infinities one may multiply the entire series
by $(n - r)$, which causes all terms of order higher than $l - 2n$ to vanish
while the others remain finite. Hence the series begins with the power
$x^{l-2n} = x^{-l-1}$, and inspection shows it then to be identical with (38).
In this case, our method has yielded but one particular solution, and this is
an infinite series. Procedures leading to a general solution are discussed in
treatises on Differential Equations.⁷

When $2l$ is equal to an odd negative integer, (38) degenerates into (37)
in a manner similar to the above. In that case also no general solution can
be obtained by the present method.

Having now given a fairly complete mathematical analysis of the solutions
of Legendre’s equation, we state some conclusions of practical importance.
In almost all applications (cf. Chapters 7, 8, 11) the independent
variable $x$ appearing in eq. (28) is the cosine of an angle. The functions of
interest are therefore those which remain finite for all values which $x = \cos\theta$
can assume; these values include $x = \pm1$. Such functions exist only when
$l$ is a positive or a negative integer, as we have shown. But when $l$ is an
integer, consideration may be limited to solutions (37) and (38), because
the others reduce to these. Moreover, inspection shows that solution (38)
with $l$ replaced by $-(l + 1)$ is the same as solution (37). Hence we may
further limit our consideration to positive values of $l$ (including 0) and
retain only (37) as a significant solution. Finally we note that (37) is
identical with (39). Hence:

In physico-chemical problems, where $x = \cos\theta$, the only solution of
Legendre’s equation which is of practical interest is $P_l(\cos\theta)$.

⁷ See Forsyth, A. R., “Differential Equations,” Macmillan Co., London, 1914.


Problems. 

a. Prove that, when $l$ is an even negative integer, the expressions (35) and (38)
become identical.

b. Prove that, when $2l$ is an odd negative integer, expressions (37) and (38) become
identical.


Differential Equation for Associated Legendre Functions, or Associated
Spherical Harmonics.

An equation similar to Legendre’s plays a considerable rôle in mathematical
physics. It is⁸

$$(1 - x^2)y'' - 2xy' + \left[l(l+1) - \frac{m^2}{1-x^2}\right]y = 0 \tag{2-41}$$

where $l$ and $m$ are both integers, and has a particular solution:

$$y = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x) \tag{2-42}$$

The other particular solution is related to $Q_l$ and is of lesser interest in
applications. To construct (42) by the method of series integration is
perfectly feasible, but we shall here use a simpler method based on the
foregoing results. If $P_l(x)$ is a solution of

$$(1 - x^2)y'' - 2xy' + l(l+1)y = 0$$

then

$$\frac{d^m}{dx^m}P_l(x)$$

for which we shall write $P_l^{(m)}(x)$, satisfies the equation

$$(1 - x^2){P_l^{(m)}}'' - 2x(m+1){P_l^{(m)}}' + [l(l+1) - m(m+1)]P_l^{(m)} = 0 \tag{2-43}$$

as is seen when Legendre’s equation is differentiated $m$ times. Now let

$$P_l^{(m)}(x) = (1 - x^2)^ry \tag{2-44}$$

and determine, by substituting this into (43), what differential equation $y$
will satisfy. After substitution, (43) will read

$$(1-x^2)^{r-1}\{(4r^2x^2 - 2r - 2rx^2)y - 4rx(1-x^2)y' + (1-x^2)^2y'' - 2(m+1)(1-x^2)xy' + 4r(m+1)x^2y + [l(l+1) - m(m+1)](1-x^2)y\} = 0$$

If here the special value $r = -m/2$ is chosen, this equation reduces to (41).
We have shown, therefore, that (44) is true with $r = -m/2$ and hence that

$$y = (1 - x^2)^{m/2}P_l^{(m)}(x)$$

as was asserted. The function $P_l^{(m)}$, which is a polynomial of degree $l - m$
and which satisfies eq. (43), is sometimes referred to by physicists as
Helmholtz function. The function (42) is known as an associated Legendre
function, or more frequently, an associated spherical harmonic.

⁸ The equation occurs more commonly in the equivalent forms

$$\frac{d^2y}{d\theta^2} + \cot\theta\frac{dy}{d\theta} + \left[l(l+1) - \frac{m^2}{\sin^2\theta}\right]y = 0$$

or

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{dy}{d\theta}\right) + \left[l(l+1) - \frac{m^2}{\sin^2\theta}\right]y = 0$$

which reduce to (41) on substitution of $\cos\theta = x$.
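As an illustration of the result, take $l = 3$, $m = 1$. With $P_3(x) = (5x^3 - 3x)/2$ the function (42) is $y = (1-x^2)^{1/2}(15x^2 - 3)/2$. The sketch below inserts this $y$ into the left side of (41), estimating the derivatives by central differences; the function names and the step size are illustrative assumptions of ours.

```python
import math

def y(x):
    """(1 - x^2)^(1/2) * dP_3/dx with P_3 = (5x^3 - 3x)/2, i.e. the case l = 3, m = 1."""
    return math.sqrt(1.0 - x * x) * (15.0 * x * x - 3.0) / 2.0

def residual(x, h=1e-4):
    """(1 - x^2) y'' - 2x y' + [l(l+1) - m^2/(1 - x^2)] y for l = 3, m = 1."""
    l, m = 3, 1
    d1 = (y(x + h) - y(x - h)) / (2.0 * h)
    d2 = (y(x + h) - 2.0 * y(x) + y(x - h)) / h**2
    return (1.0 - x * x) * d2 - 2.0 * x * d1 + (l * (l + 1) - m * m / (1.0 - x * x)) * y(x)
```

At interior points such as `x = 0.5` the residual should vanish to within the finite-difference error.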

2.12. General Considerations Regarding Series Integration. Fuchs’ 
Theorem.—Before continuing, the reader will wish to know the limits of 
applicability of the method applied in sec. 2.11, and in particular what 
properties of the solution one may read directly from the differential 
equation. First, then, let us ask the question: Will the method described 
in sec. 2.11 always work? In preparation for the answer, we consider the
differential equation

$$y'' + \frac{1}{x^3}y = 0$$

On putting $y = \sum_\lambda a_\lambda x^{\kappa+\lambda}$ it is seen that

$$\sum_\lambda a_\lambda(\kappa+\lambda)(\kappa+\lambda-1)x^{\kappa+\lambda-2} = -\sum_\lambda a_\lambda x^{\kappa+\lambda-3}$$

The indicial equation, obtained by putting the coefficient of the lowest
power of $x$ equal to zero, simply reads

$$a_0 = 0$$

and does not determine $\kappa$. Furthermore,

$$a_{j+1} = -(\kappa+j)(\kappa+j-1)a_j$$

so that $a_1 = -a_0(\kappa - 1)\kappa$. Since $a_0 = 0$, this means that either $\kappa = \infty$ or
$a_1$ is also zero. In neither case do we get any solution at all.
Equally instructive is the equation

$$y'' + \frac{1}{x^2}y' = 0$$

Its indicial equation yields $\kappa = 0$. The recurrence relation between
coefficients is

$$a_{j+1} = -\frac{(j-1)j}{j+1}\,a_j$$

Thus we have apparently determined a solution. But let us apply a convergence
test. Denoting again the terms of the series by $u$, one sees that

$$\lim_{n\to\infty}\left|\frac{u_{n+1}}{u_n}\right| = \lim_{n\to\infty}\frac{|a_{n+1}|\,|x|^{n+1}}{|a_n|\,|x|^n} = \lim_{n\to\infty}\frac{(n-1)n}{n+1}\,|x| = \infty$$

This is greater than 1 as $n \to \infty$ for every finite value of $x$, so that there is no
range of $x$ at all in which the series converges. Again, the method fails.

To enlarge our outlook, let us now return to the general form of the
equation we wish to solve, that is, to eq. (27). As a rule there will be
values of $x$ for which one or both of the functions $X_1$ and $X_2$ become
infinite. If $x = x_0$ is such a value, then $x_0$ is said to be a singular point
of the equation. It is at such singular points that the method of integration
in series may break down. To be more specific, a solution of the form
$y = \sum_\lambda a_\lambda(x - x_0)^{\kappa+\lambda}$ may not exist at singular points $x_0$.

In dealing with Legendre’s equation, a power series development was
attempted about the point $x_0 = 0$. It succeeded because, after writing
the equation in the form (27), neither $X_1 = -2x/(1 - x^2)$ nor $X_2 =
l(l+1)/(1 - x^2)$ becomes infinite at $x = 0$. But the points $x = \pm1$ are
singular points of the equation, and it is for this reason that the general
solution obtained breaks down at these two points. Again, the two
equations just considered, $y'' + x^{-3}y = 0$ and $y'' + x^{-2}y' = 0$ possess a
singular point at $x = 0$, and this is the cause of the failure of the present
method.

But while the method often fails if the differential equation has a singular
point at the place where the power series development is attempted, it
does not always do so. For instance, the equation

$$y'' + \frac{1}{x}y' - \frac{1}{x^2}y = 0$$

may be developed in the form $y = \sum_\lambda a_\lambda x^{\kappa+\lambda}$ despite its singularities at
$x = 0$. The indicial equation yields $\kappa = \pm1$. When the positive sign is
chosen, the coefficients must satisfy the equation

$$[(j + 1)^2 - 1]a_j = 0$$

which is no longer a recurrence relation but serves to determine the coefficients
just as well. For it says that every $a_j = 0$, except for $j = 0$.
The corresponding solution is $y = a_0x$. For $\kappa = -1$ we have

$$[(j - 1)^2 - 1]a_j = 0$$

and this indicates that all coefficients must be zero except that corresponding
to $j = 0$ and to $j = 2$. Hence the solution is

$$y = x^{-1}(a_0 + a_2x^2)$$

The constants $a_0$ and $a_2$ are arbitrary, which implies that the solution is a
general one, including $y = \text{const.}\,x$ as a special case. Obviously, then, it is
important to settle what kind of singularities do, and what kind do not,
permit an integration in series about the singular point.
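That $y = x^{-1}(a_0 + a_2x^2)$ satisfies $y'' + y'/x - y/x^2 = 0$ for arbitrary $a_0$ and $a_2$ can be confirmed by direct substitution. A minimal sketch in exact rational arithmetic follows; the function name is ours, and the derivatives are written out by hand in the code.

```python
from fractions import Fraction

def residual(a0, a2, x):
    """y'' + y'/x - y/x^2 for y = a0/x + a2*x, evaluated exactly at a nonzero x."""
    x = Fraction(x)
    y = a0 / x + a2 * x
    dy = -a0 / x**2 + a2          # first derivative of a0/x + a2*x
    d2y = 2 * a0 / x**3           # second derivative
    return d2y + dy / x - y / x**2
```

The residual is identically zero for any nonzero `x`, whatever the two constants.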

This issue is settled by an important theorem due to Fuchs, which states
the following:

If the differential equation

$$y'' + X_1y' + X_2y = 0$$

possesses a singular point at $x = x_0$, then a convergent development of the
solution in a power series about the point $x = x_0$ having only a finite number
of terms with negative exponents is nevertheless possible provided that
$(x - x_0)X_1(x_0)$ and $(x - x_0)^2X_2(x_0)$ remain finite.
This clearly is true for the equation

$$y'' + \frac{1}{x}y' - \frac{1}{x^2}y = 0$$

at $x_0 = 0$, but not for

$$y'' + \frac{1}{x^3}y = 0$$

Thus the results just obtained are accounted for. The proof of Fuchs’
theorem is a matter of some length and will not be undertaken here.⁹
In conformity with the theorem singularities in $X_1$ and $X_2$ occurring at
$x = x_i$, which are removable by multiplication by the factors $(x - x_i)$
and $(x - x_i)^2$ respectively are called non-essential singularities of the
differential equation; all others are essential ones.¹⁰ All regular and
non-essentially singular points are sometimes referred to as regular points of the
differential equations (German: “Stellen der Bestimmtheit”). An
equation which has no essential singularities in the entire infinite complex
plane is said to belong to the Fuchsian class of differential equations.

⁹ See, for instance, Schmidt, H., “Theorie der Wellengleichung,” Leipzig, 1931.
¹⁰ Whether the point at infinity is an essentially singular one cannot at once be seen
in this way. To examine it the transformation $\xi = 1/x$ must be made. One may then
show that the point at infinity is essentially singular if $X_1x$ or $X_2x^2$ become infinite there;
it is non-essentially singular if $2x - X_1x^2 \to \infty$ or $X_2x^4 \to \infty$; otherwise it is regular.

A final remark on the nature of the solutions obtained by the method of
integration in series is in order. Even if the point at which the development
is made satisfies the Fuchs conditions it may not be possible to obtain
two independent solutions which, when combined linearly with the use of
two arbitrary constants, will yield the general solution. If this process is
to produce a general solution, further conditions must be met. Since
general solutions are not often required in physical and chemical applications,
this matter will not be considered in detail here.¹¹ We note, however,
that two independent solutions in the form $y_1 = \sum_\lambda a_\lambda(x - x_0)^{\kappa_1+\lambda}$
and $y_2 = \sum_\lambda a_\lambda'(x - x_0)^{\kappa_2+\lambda}$ can always be obtained when the two roots of the
indicial equation, $\kappa_1$ and $\kappa_2$, do not differ by an integer or by zero.


SPECIAL EQUATIONS SOLVABLE BY SERIES INTEGRATION 
2.13. Gauss’ (Hypergeometric) Differential Equation.— 

$$(x^2 - x)y'' + [(1 + \alpha + \beta)x - \gamma]y' + \alpha\beta y = 0 \tag{2-45}$$

The parameters $\alpha$, $\beta$, $\gamma$ are constants, and it will be assumed that $\gamma$ is not
an integer. Eq. (45) has singularities at 0, 1, and $\infty$, but they are all
non-essential. On development about $x = 0$, the indicial equation reads

$$\kappa(\kappa - 1) + \kappa\gamma = 0$$

hence $\kappa = 0, 1 - \gamma$. Choosing $\kappa = 0$, we obtain the recurrence formula

$$a_{j+1} = \frac{(\alpha+j)(\beta+j)}{(j+1)(\gamma+j)}\,a_j \tag{2-46}$$

and hence the particular solution

$$y = a_0\left\{1 + \frac{\alpha\beta}{1\cdot\gamma}x + \frac{\alpha(\alpha+1)\beta(\beta+1)}{1\cdot2\cdot\gamma(\gamma+1)}x^2 + \cdots + \frac{\alpha(\alpha+1)\cdots(\alpha+r-1)\cdot\beta(\beta+1)\cdots(\beta+r-1)}{r!\,\gamma(\gamma+1)\cdots(\gamma+r-1)}x^r + \cdots\right\} \tag{2-47}$$

The series in {} is known as the hypergeometric series. It converges if
$|x| < 1$. For $\alpha = 1$, $\beta = \gamma$ it reduces to the ordinary geometric series;
hence its name. It is customary to denote the hypergeometric series by
$F(\alpha,\beta,\gamma;x)$. With this abbreviation, then, this particular solution is

$$y = a_0F(\alpha,\beta,\gamma;x)$$

Next, we take $\kappa = 1 - \gamma$. The recurrence relation reads

$$a_{j+1} = \frac{(\alpha+j-\gamma+1)(\beta+j-\gamma+1)}{(j+1)(j+2-\gamma)}\,a_j \tag{2-48}$$

¹¹ For particulars, see Bôcher, M., “Regular Points of Linear Differential Equations
of the Second Order,” Harvard University Press.

When the new constants: $\alpha' = \alpha - \gamma + 1$, $\beta' = \beta - \gamma + 1$, $\gamma' = 2 - \gamma$,
are introduced in (48) it becomes

$$a_{j+1} = \frac{(\alpha'+j)(\beta'+j)}{(j+1)(j+\gamma')}\,a_j$$

that is, it takes the same form as (46). The particular solution corresponding
to (48) may therefore be written

$$a_0x^{1-\gamma}F(\alpha-\gamma+1, \beta-\gamma+1, 2-\gamma; x)$$

We have thus arrived at the following general solution of (45):

$$y = AF(\alpha,\beta,\gamma;x) + Bx^{1-\gamma}F(\alpha-\gamma+1, \beta-\gamma+1, 2-\gamma; x) \tag{2-49}$$

whose range of convergence is $|x| < 1$.

There is an interesting and sometimes useful relation between the
solutions of Gauss’ and those of Legendre’s equation. Let us introduce in
(45) the new independent variable $\xi$, given by

$$x = \tfrac{1}{2}(1 - \xi)$$

so that it takes the form

$$(1 - \xi^2)\frac{d^2y}{d\xi^2} + [\alpha + \beta + 1 - 2\gamma - (\alpha + \beta + 1)\xi]\frac{dy}{d\xi} - \alpha\beta y = 0 \tag{2-50}$$

This reduces to Legendre’s equation (28) if we specify the constants to be

$$\alpha = l + 1, \qquad \beta = -l, \qquad \gamma = 1$$

One particular solution of Legendre’s equation is therefore

$$y = a_0F\left(l+1, -l, 1; \frac{1-\xi}{2}\right)$$

From the fact that this solution, expanded in powers of $\xi$ starts with a
constant term it is clear that it must be identical (aside from a constant
factor) with (34). In particular, if $l$ is a positive integer, it must be $P_l$.
This happens to be true, as the reader may verify, even with respect to the
constant factor if $P_l$ is defined as in (39). Thus

$$P_l(\xi) = F\left(l+1, -l, 1; \frac{1-\xi}{2}\right) \tag{2-51}$$
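Relation (51) and the remark on the geometric series may both be tried out numerically by summing the hypergeometric series term by term with the recurrence $a_{j+1} = \{(\alpha+j)(\beta+j)/[(j+1)(\gamma+j)]\}a_j$. The sketch below is a plain partial sum under our own naming; the number of terms is an arbitrary choice, and the routine is meant only for $|x| < 1$ or for terminating series.

```python
from fractions import Fraction

def hypergeometric(alpha, beta, gamma, x, terms=60):
    """Partial sum of F(alpha, beta, gamma; x) with a_0 = 1.

    Each new term is the previous one times
    (alpha + j)(beta + j) x / [(j + 1)(gamma + j)]."""
    total = 0
    term = 1                      # current value of a_j * x^j
    for j in range(terms):
        total += term
        term = term * (alpha + j) * (beta + j) * x / ((j + 1) * (gamma + j))
    return total
```

With exact rational input the terminating case is reproduced exactly: for $l = 2$ and $\xi = 1/3$ the sum equals $P_2(1/3) = (3\xi^2 - 1)/2 = -1/3$. With $\alpha = 1$, $\beta = \gamma$ and $x = 0.5$ the partial sums approach $1/(1 - x) = 2$.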
An equation known to mathematicians as Tschebyscheff’s results when
in (50) we specialize the constants as follows:

$$\alpha = -\beta = n, \text{ an integer}; \qquad \gamma = \tfrac{1}{2}$$

The equation then reads:

$$(1 - \xi^2)\frac{d^2y}{d\xi^2} - \xi\frac{dy}{d\xi} + n^2y = 0 \tag{2-52}$$

Its solution is clearly

$$y(\xi) = AF\left(n, -n, \tfrac{1}{2}; \frac{1-\xi}{2}\right) + B\left(\frac{1-\xi}{2}\right)^{1/2}F\left(n+\tfrac{1}{2}, -n+\tfrac{1}{2}, \tfrac{3}{2}; \frac{1-\xi}{2}\right) \tag{2-53}$$

The first particular solution here written is a polynomial known as the
Tschebyscheff polynomial, of degree $n$. If multiplied by the proper factor
it has the alternative form:

$$T_n(\xi) = \xi^n - \frac{n}{1!\,2^2}\xi^{n-2} + \frac{n(n-3)}{2!\,2^4}\xi^{n-4} - \frac{n(n-4)(n-5)}{3!\,2^6}\xi^{n-6} + \cdots \tag{2-54}$$

This development stops with a constant or a term proportional to $\xi$.
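The terminating series $F(n, -n, \tfrac{1}{2}; (1-\xi)/2)$ can be summed exactly and compared with the familiar trigonometric polynomials $\cos(n\cos^{-1}\xi)$, for example $2\xi^2 - 1$ for $n = 2$ and $4\xi^3 - 3\xi$ for $n = 3$. The sketch below does this in rational arithmetic; the function name is ours, and the identification with $\cos(n\cos^{-1}\xi)$ is a standard fact quoted here as a check rather than taken from the text.

```python
from fractions import Fraction

def chebyshev_via_F(n, xi):
    """F(n, -n, 1/2; (1 - xi)/2) summed exactly; the series stops after n + 1 terms."""
    x = (1 - Fraction(xi)) / 2
    gamma = Fraction(1, 2)
    total, term = Fraction(0), Fraction(1)
    for j in range(n + 1):
        total += term
        # Ratio of successive terms, from the recurrence with alpha = n, beta = -n.
        term = term * (n + j) * (-n + j) * x / ((j + 1) * (gamma + j))
    return total
```

For `n = 3` and `xi = Fraction(3, 10)` the result is $4(0.3)^3 - 3(0.3) = -0.792$.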
The function $F(\alpha,\beta,\gamma;x)$ reduces to a polynomial when $\alpha = -n$,
$n$ being a positive integer, as may be seen from its definition (47). The
resulting polynomial, which is of degree $n$, is known as a Jacobi polynomial,
defined as follows:

$$J_n(p,q;x) = F(-n, p+n, q; x) \tag{2-55}$$

It satisfies the differential equation

$$(x^2 - x)y'' + [(1 + p)x - q]y' - n(p + n)y = 0 \tag{2-56}$$

in which $q$ must satisfy $q > 0$. Substitution of $\alpha = -n$, $\beta = p + n$,
$\gamma = q$ into (47) shows that¹²

$$J_n(p,q;x) = 1 + \sum_{\lambda=1}^{n}(-1)^\lambda C^n_\lambda\frac{(p+n)(p+n+1)\cdots(p+n+\lambda-1)}{q(q+1)\cdots(q+\lambda-1)}x^\lambda$$

Problem. Find the solution of (45) about the point $x = 1$; i.e., find solutions of
the form

$$y = \sum_\lambda a_\lambda(x - 1)^{\kappa+\lambda}$$

Ans.

$$y = AF(\alpha, \beta, \alpha+\beta-\gamma+1; 1-x) + B(1-x)^{\gamma-\alpha-\beta}F(\gamma-\beta, \gamma-\alpha, 1-\alpha-\beta+\gamma; 1-x)$$
2.14. Bessel’s Equation.— 
ry” + xy’ + (2? ~ n y = 0 (2-57) 


nis a constant. Since the equation js regular at x = 0, its solution may 
be developed as a power series about that point. The indicial equation 


(x? — n?)ag = 0 


12 Cf, eq. 12—2 for the definition of C} 
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has the two roots x = +n. According to the remarks at the end of sec. 2.12 
we can obtain two independent particular solutions if 2n is not an integer; 
if it is, the method may allow us to determine only one, Taking « =n 
one finds . 


“| x? zt 
y= a ETET EEO E T 
r r? 
+l peepee eet (2-58) 
For « = —n 
-n z? zt 
Y = Gor tta stne et 
gar 
tse a y aat (2-59) 


When the constant ap in (58) is chosen to be?3 1/[2"T (n + 1)], the resulting 
expression 


o (—1) r n-+2h 
y= hei oa p pra tae) 5) (2-60) 


is called a Bessel function of order n. 
When (59) is multiplied by the same factor it becomes J_,(r). Hence 
the complete solution of Bessel’s equation (when 7 is not an integer) is 


y =Adn(x) + BJ_»(2) (2-61) 


Inspection of (58) and (59) shows that no difficulty arises when n is half- 
integral, although the difference of the roots of the indicial equation is an 
integer. But if n is an integer, J_, is no longer independent of Jn. For 
in that case the coefficient of x” in (59) has a vanishing term in the denomi- 
nator, and every subsequent coefficient likewise becomes infinite. Multi- 
plication by the vanishing term makes every term preceding the n-th zero. 
The series then starts with 2" and is seen to be identical (except for a 
constant multiplier) with (58). For integral n, therefore, we have obtained 
only one solution, namely J,(z).1* By choosing the constants A and B of 


18 The Gamma function appearing here is a generalization of the factorial n! which 
is defined only for integers (and zero). Ifnis an integer, T(n +1) =n! In general, 


T(z) = et” ldt; it is easily seen to reduce to n! when z =n. Moreover, this in- 
0 

tegral defines the “ smoothest ” function which takes on the values n! at the integers. 

Cf. sec. 3.2. 


14 The second particular solution fnr integral n is derived in Forsyth, “ Differential 
Equations,” Macmillan, p. 182. 
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(61) suitably, several particular solutions of Bessel’s equation (such as 
Neumann’s and Hankel’s functions) having useful properties may be con- 
structed. They will be discussed in sec. 3.9. 
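The series defining the Bessel function lends itself to direct summation. The following sketch (the name `bessel_j` is hypothetical, and the number of terms is an arbitrary choice) sums $\sum_\lambda (-1)^\lambda (x/2)^{n+2\lambda} / [\Gamma(\lambda+1)\Gamma(n+\lambda+1)]$ with the standard-library gamma function and compares the half-integral orders $n = \pm\frac{1}{2}$ with the known closed forms $\sqrt{2/\pi x}\,\sin x$ and $\sqrt{2/\pi x}\,\cos x$, a case in which both series remain usable even though the roots of the indicial equation differ by an integer.

```python
import math


def bessel_j(n, x, terms=40):
    """Partial sum of the power series for J_n(x), x > 0.

    n may be any real order except a negative integer, where Gamma(n+k+1)
    has poles and math.gamma raises ValueError.
    """
    return sum(
        (-1) ** k / (math.gamma(k + 1) * math.gamma(n + k + 1)) * (x / 2) ** (n + 2 * k)
        for k in range(terms)
    )


for x in (0.5, 2.0, 3.0):
    prefactor = math.sqrt(2 / (math.pi * x))
    assert abs(bessel_j(0.5, x) - prefactor * math.sin(x)) < 1e-12
    assert abs(bessel_j(-0.5, x) - prefactor * math.cos(x)) < 1e-12
```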
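2.15. Hermite’s Differential Equation.—

$$y'' - 2xy' + 2\alpha y = 0;\quad \alpha = \text{constant} \tag{2-62}$$

The roots of the indicial equation are $\kappa = 0, 1$; the recurrence relations
between the coefficients are

$$a_{j+2} = \frac{2(\kappa + j) - 2\alpha}{(\kappa + j + 1)(\kappa + j + 2)}\,a_j$$

For $\kappa = 0$ we find the solution

$$y = a_0\left(1 - \frac{2\alpha}{2!}x^2 + \frac{2^2\alpha(\alpha-2)}{4!}x^4 - \cdots + (-1)^k\frac{2^k\alpha(\alpha-2)\cdots(\alpha-2k+2)}{(2k)!}x^{2k} + \cdots\right) \tag{2-63}$$

while for $\kappa = 1$

$$y = a_0x\left(1 - \frac{2(\alpha-1)}{3!}x^2 + \frac{2^2(\alpha-1)(\alpha-3)}{5!}x^4 - \cdots + (-1)^k\frac{2^k(\alpha-1)(\alpha-3)\cdots(\alpha-2k+1)}{(2k+1)!}x^{2k} + \cdots\right) \tag{2-64}$$

The general solution of Hermite’s equation is a superposition of these. If
$\alpha$ is an even integer n, (63) reduces to an even polynomial of degree n.
On choosing for $a_0$ the value

$$(-1)^{n/2}\,\frac{n!}{(n/2)!}$$

this polynomial becomes

$$H_n(x) = (2x)^n - \frac{n(n-1)}{1!}(2x)^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2!}(2x)^{n-4} - \cdots \tag{2-65}$$

and this is known as the Hermite polynomial of degree n. If $\alpha$ is an odd
integer, n, (64) reduces to an odd polynomial of degree n. In fact if we
choose for $a_0$ the value

$$(-1)^{(n-1)/2}\,\frac{2\,n!}{\left(\frac{n-1}{2}\right)!}$$

that particular solution also takes on the form $H_n(x)$.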
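The way the recurrence relation terminates for integral $\alpha = n$ can be made concrete with exact rational arithmetic. This is a sketch under assumptions: the function name `hermite_coeffs` is hypothetical, the recurrence is taken as $a_{j+2} = [2(\kappa+j) - 2n]\,a_j / [(\kappa+j+1)(\kappa+j+2)]$, and the starting values are $a_0 = (-1)^{n/2}\,n!/(n/2)!$ for even n and $a_0 = (-1)^{(n-1)/2}\,2\,n!/(\frac{n-1}{2})!$ for odd n.

```python
from fractions import Fraction
from math import factorial


def hermite_coeffs(n):
    """Return [c_0, ..., c_n] with H_n(x) = sum c_j x**j, built from the
    two-term recurrence with alpha = n (kappa = 0 for even n, 1 for odd n)."""
    kappa = n % 2
    half = (n - kappa) // 2
    a = Fraction((-1) ** half * (kappa + 1) * factorial(n), factorial(half))
    coeffs = [Fraction(0)] * (n + 1)
    j = 0
    while kappa + j <= n:
        coeffs[kappa + j] = a  # a_j multiplies x**(kappa + j)
        a = a * (2 * (kappa + j) - 2 * n) / ((kappa + j + 1) * (kappa + j + 2))
        j += 2
    return [int(c) for c in coeffs]


# H_3(x) = 8x^3 - 12x and H_4(x) = 16x^4 - 48x^2 + 12
assert hermite_coeffs(3) == [0, -12, 0, 8]
assert hermite_coeffs(4) == [12, 0, -48, 0, 16]
```

The factor $2(\kappa + j) - 2n$ vanishes when $\kappa + j = n$, which is what cuts the series off at degree n.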
An equation very similar to that of Hermite is

$$y'' + (1 - x^2 + 2\alpha)y = 0 \tag{2-66}$$

For if we put $y = e^{-x^2/2}v$, so that $y'' = \{(x^2 - 1)v - 2xv' + v''\}e^{-x^2/2}$,
the equation turns into

$$v'' - 2xv' + 2\alpha v = 0$$

which is identical with (62). Hence the solution of (66) is simply any
solution of Hermite’s equation, multiplied by $e^{-x^2/2}$.
2.16. Laguerre’s Differential Equation.—

$$xy'' + (1 - x)y' + \alpha y = 0;\quad \alpha = \text{constant} \tag{2-67}$$

has a non-essential singularity at the origin. Developing about x = 0,
the indicial equation has the single root $\kappa = 0$. Only one solution will be
obtained, this being of considerable importance in physics. The
recurrence relation reads:

$$a_{j+1} = \frac{j - \alpha}{(j + 1)^2}\,a_j$$

hence

$$y = a_0\left(1 - \alpha x + \frac{\alpha(\alpha-1)}{(2!)^2}x^2 - \cdots + (-1)^r\frac{\alpha(\alpha-1)\cdots(\alpha-r+1)}{(r!)^2}x^r + \cdots\right) \tag{2-68}$$


This expression becomes a polynomial when $\alpha = n$, a positive integer. On
putting

$$a_0 = n!$$

and for integral n, y becomes the Laguerre polynomial of degree n:

$$L_n(x) = (-1)^n\left(x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n n!\right) \tag{2-69}$$
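A short sketch of how the recurrence generates the Laguerre polynomial follows. It assumes the normalization $a_0 = n!$, which is what the constant term of (69) requires, and the recurrence $a_{j+1} = (j - n)\,a_j/(j+1)^2$; the function name `laguerre_coeffs` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def laguerre_coeffs(n):
    """Return [a_0, ..., a_n] with L_n(x) = sum a_j x**j, starting from
    a_0 = n! and applying a_{j+1} = (j - n) / (j + 1)**2 * a_j."""
    coeffs, a = [], Fraction(factorial(n))
    for j in range(n + 1):
        coeffs.append(int(a))
        a = a * (j - n) / (j + 1) ** 2
    return coeffs


# L_2(x) = x^2 - 4x + 2 and L_3(x) = -(x^3 - 9x^2 + 18x - 6)
assert laguerre_coeffs(2) == [2, -4, 1]
assert laguerre_coeffs(3) == [6, -18, 9, -1]
```

With this choice the coefficient of $x^n$ comes out as $(-1)^n$, in agreement with the leading term of (69).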


A differential equation at once reducible to Laguerre’s is

$$xy'' + (k + 1 - x)y' + (\alpha - k)y = 0,\quad k \text{ an integer} > 0 \tag{2-70}$$

It results when (67) is differentiated k times and y is replaced by its k-th
derivative. Hence a solution of (70) for integral and positive $\alpha$ and k is

$$y = \frac{d^k}{dx^k}L_n(x) \equiv L_n^k(x)$$

This is sometimes called the associated Laguerre polynomial of degree
n − k.

A third function closely related to the Laguerre polynomials satisfies
the differential equation

$$xy'' + 2y' + \left[n - \frac{k-1}{2} - \frac{x}{4} - \frac{k^2-1}{4x}\right]y = 0 \tag{2-71}$$

If we substitute in this equation $y = e^{-x/2}x^{(k-1)/2}v$, then v is seen to be a
solution of

$$xv'' + (k + 1 - x)v' + (n - k)v = 0$$

Comparison with (70) shows, therefore, that $v = L_n^k(x)$. Hence a
particular solution of (71) is

$$y = e^{-x/2}x^{(k-1)/2}L_n^k(x) \tag{2-72}$$

This function is known as an associated Laguerre function; it is of great
importance in the theory of the hydrogen atom. We observe that if n in
(71) were not an integer but any constant $\alpha$, the corresponding solution of
(71) would be

$$y = e^{-x/2}x^{(k-1)/2}\frac{d^k}{dx^k}L_\alpha(x)$$

where $L_\alpha$ is written for the series (68); provided, of course, that k is a
positive integer. This solution would no longer be a polynomial in x
multiplied by $e^{-x/2}x^{(k-1)/2}$, but an infinite series.

2.17. Mathieu’s Equation.—In the previous sections attention has
been given to differential equations in which $X_1$ and $X_2$¹⁵ were algebraic
functions of x. Equations sometimes arise in which these functions
are periodic. The simplest instance of these is Mathieu’s equation, usually
written in the form

$$\frac{d^2y}{dx^2} + (a + 16b\cos 2x)y = 0 \tag{2-73}$$

where a and b are constants. Its general solution may be obtained by the
method of integration in series if the substitution

$$\xi = \cos^2 x$$

is made. (73) then reads

$$4\xi(1 - \xi)\frac{d^2y}{d\xi^2} + 2(1 - 2\xi)\frac{dy}{d\xi} + (a - 16b + 32b\xi)y = 0 \tag{2-74}$$

¹⁵ Defined by eq. (27).
This equation has a nonessential singularity at $\xi = 0$ and can therefore be
developed as a power series about the origin. On inserting

$$y = \sum_\lambda a_\lambda\xi^{\kappa+\lambda}$$

in (74) we obtain

$$2\sum_\lambda(\kappa+\lambda)(2\kappa+2\lambda-1)a_\lambda\xi^{\kappa+\lambda-1} - \sum_\lambda[4(\kappa+\lambda)^2 - a + 16b]a_\lambda\xi^{\kappa+\lambda} + 32b\sum_\lambda a_\lambda\xi^{\kappa+\lambda+1} = 0$$

Here a feature arises which was not encountered before; the equation
contains three different summations instead of two and will therefore lead
to a three-term recurrence relation between the coefficients $a_\lambda$ instead of the
two-term relations that occurred in the former instances. This, however,
requires no modification of procedure, except that it will force us to advance
step by step in the computation of the coefficients. Only the first
summation can contribute to the coefficient of $\xi^{\kappa-1}$, which must be zero. Hence
the indicial equation is formed as before:

$$\kappa(2\kappa - 1) = 0$$

whence we obtain the two choices: $\kappa = 0, \tfrac{1}{2}$. Next, we equate to zero
the coefficients of $\xi^\kappa$, to which the first and second summations contribute.
This leads to

$$2(\kappa+1)(2\kappa+1)a_1 = (4\kappa^2 - a + 16b)a_0$$

so that

$$a_1 = a_0\,\frac{4\kappa^2 - a + 16b}{2(\kappa+1)(2\kappa+1)}$$

from which $a_1$ may be determined when the arbitrary constant $a_0$ is
assumed. On equating to zero the coefficient of $\xi^{\kappa+1}$, to which all three
summations contribute, one gets

$$2(\kappa+2)(2\kappa+3)a_2 - [4(\kappa+1)^2 - a + 16b]a_1 + 32b\,a_0 = 0$$

a relation permitting the calculation of $a_2$, etc. In this way two series can
be constructed, one for $\kappa = 0$, the other for $\kappa = \tfrac{1}{2}$, linear composition of
which yields the general solution of (74) and hence of (73). Investigation
shows that this solution converges if $|\xi| < 1$.
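The step-by-step computation forced by a three-term recurrence can be sketched as follows. The general relation used here, $2(\kappa+\lambda+1)(2\kappa+2\lambda+1)a_{\lambda+1} = [4(\kappa+\lambda)^2 - a + 16b]a_\lambda - 32b\,a_{\lambda-1}$, is the pattern of which the relations for $a_1$ and $a_2$ are the first two cases; the function names and the sample values of a, b and $\xi$ are illustrative assumptions. The truncated series is then substituted back into (74) to confirm that the residual is negligible inside $|\xi| < 1$.

```python
def mathieu_coeffs(a, b, kappa, n_terms, a0=1.0):
    """Coefficients a_0 ... a_{n_terms-1} from the three-term recurrence."""
    c = [a0]
    for lam in range(n_terms - 1):
        prev = c[lam - 1] if lam >= 1 else 0.0  # there is no a_{-1}
        rhs = (4 * (kappa + lam) ** 2 - a + 16 * b) * c[lam] - 32 * b * prev
        c.append(rhs / (2 * (kappa + lam + 1) * (2 * kappa + 2 * lam + 1)))
    return c


def residual(a, b, kappa, c, xi):
    """Left-hand side of (74) evaluated on the truncated series (xi > 0)."""
    y = sum(ck * xi ** (kappa + k) for k, ck in enumerate(c))
    dy = sum(ck * (kappa + k) * xi ** (kappa + k - 1) for k, ck in enumerate(c))
    d2y = sum(ck * (kappa + k) * (kappa + k - 1) * xi ** (kappa + k - 2)
              for k, ck in enumerate(c))
    return (4 * xi * (1 - xi) * d2y + 2 * (1 - 2 * xi) * dy
            + (a - 16 * b + 32 * b * xi) * y)


for kappa in (0.0, 0.5):  # the two roots of the indicial equation
    coeffs = mathieu_coeffs(1.0, 0.1, kappa, 60)
    assert abs(residual(1.0, 0.1, kappa, coeffs, 0.3)) < 1e-9
```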

This general solution, however, is rarely of interest in physics and
chemistry, for it is not periodic in x. In most problems leading to
Mathieu’s equation, x is an angle, so that there is no significant distinction
between x and $x + 2n\pi$, where n is an integer. Thus the solutions usually
sought must have the property that $y(x + 2\pi n) = y(x)$. The general
solution here found, which is of the form

$$\sum_\lambda a_\lambda\xi^\lambda + \xi^{1/2}\sum_\lambda a'_\lambda\xi^\lambda \tag{2-75}$$

does not possess this periodicity, as closer investigation would show.
Qualitatively this defect is apparent from the failure of the solution to
converge for $\xi = 1$, which excludes the values $x = n\pi$ from consideration
altogether, as well as from the existence of a branch point of (75)
at $\xi = 0$ (arising from the factor $\xi^{1/2}$).

In fact it is impossible to obtain solutions of Mathieu’s equation which
are periodic and of period $2\pi$ in x, unless definite restrictions are placed
upon the constant a. It turns out that the latter must be a complicated
function of b if the solution is to be periodic.¹⁶

¹⁶ Cf. Whittaker, E. T., and Watson, G. N., “A Course of Modern Analysis,”
Fourth Edition, Cambridge Press, 1940, for further details regarding periodic solutions.

Floquet’s Theorem. An important theorem concerning the general
solution of Mathieu’s equation, or indeed of any linear differential equation
with periodic coefficients which are one-valued functions of x, will now be
established. Suppose that $y_1(x)$ and $y_2(x)$ are two linearly independent
solutions of (73), so that any particular solution y may be compounded
from them by means of two constants $A_1$ and $A_2$ as follows:

$$y = A_1y_1 + A_2y_2 \tag{2-76}$$

Now it is clear that, if $y_1(x)$ and $y_2(x)$ are solutions of (73), $y_1(x + 2\pi)$
and $y_2(x + 2\pi)$ will also be solutions, for the substitution of $x + 2\pi$ in
place of x causes no change in the differential equation. This must, of
course, not be interpreted as implying that $y_1(x + 2\pi) = y_1(x)$ and
$y_2(x) = y_2(x + 2\pi)$; but it does mean that

$$y_1(x + 2\pi) = \alpha_{11}y_1(x) + \alpha_{12}y_2(x);\quad y_2(x + 2\pi) = \alpha_{21}y_1(x) + \alpha_{22}y_2(x)$$

the $\alpha$’s being constants. Similarly, using (76)

$$y(x + 2\pi) = A_1y_1(x + 2\pi) + A_2y_2(x + 2\pi) = (A_1\alpha_{11} + A_2\alpha_{21})y_1(x) + (A_1\alpha_{12} + A_2\alpha_{22})y_2(x)$$

We observe that the constants $\alpha$ are fixed by the choice of $y_1$ and $y_2$, but
$A_1$ and $A_2$ may be chosen at will and still leave y a particular solution of the
equation. It is possible to choose them so as to satisfy the equations

$$A_1\alpha_{11} + A_2\alpha_{21} = kA_1;\quad A_1\alpha_{12} + A_2\alpha_{22} = kA_2 \tag{2-77}$$

where k is a constant not within our control, for if eqs. (77) are to be
satisfied then k must be subject to the equation

$$\begin{vmatrix}\alpha_{11} - k & \alpha_{21}\\ \alpha_{12} & \alpha_{22} - k\end{vmatrix} = 0 \tag{2-78}$$

But if (77) holds then

$$y(x + 2\pi) = k[A_1y_1(x) + A_2y_2(x)] = ky(x) \tag{2-79}$$

In other words, there exists a particular solution y(x) such that, when x is
increased by $2\pi$, the solution itself is multiplied by the constant k. If k
were unity, this solution would be periodic.

This result may be expressed in a different way. On putting

$$k = e^{2\pi\mu},\quad y(x) = e^{\mu x}P(x)$$

eq. (79) reads

$$e^{\mu(x+2\pi)}P(x + 2\pi) = e^{2\pi\mu}e^{\mu x}P(x)$$

so that P(x) turns out to be a periodic function. Thus it is seen that there
exists a particular solution of Mathieu’s equation of the form

$$y = e^{\mu x}P(x) \tag{2-80}$$

where P is periodic. From here it is only a simple step to obtain a general
solution of (73). The differential equation is insensitive to the
substitution of −x for x. Hence $e^{-\mu x}P(-x)$ must also be a solution. Moreover,
it is an independent solution, since it is not a constant multiple of (80).
The complete solution is, therefore, a linear combination of these two:

$$y = c_1e^{\mu x}P(x) + c_2e^{-\mu x}P(-x) \tag{2-81}$$

This result, known as Floquet’s theorem, is of interest in some
astronomical applications and chiefly in the quantum theory of metals.¹⁷
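The constant k of Floquet’s theorem can be estimated numerically. The sketch below is an illustration under assumptions, not a method taken from the text: it integrates Mathieu’s equation over one interval of length $2\pi$ with the classical fourth-order Runge-Kutta scheme for the two solutions with initial values $(y, y') = (1, 0)$ and $(0, 1)$, reads off the four constants $\alpha$, and solves the quadratic determinantal condition for k. The function names and the step count are hypothetical choices.

```python
import cmath
import math


def monodromy(a, b, steps=4000):
    """Integrate y'' = -(a + 16 b cos 2x) y from 0 to 2*pi by classical RK4.

    Returns [(alpha11, alpha12), (alpha21, alpha22)], i.e. (y, y') at 2*pi for
    y1 (started with y=1, y'=0) and y2 (started with y=0, y'=1).
    """
    def deriv(x, y, dy):
        return dy, -(a + 16 * b * math.cos(2 * x)) * y

    h = 2 * math.pi / steps
    result = []
    for y, dy in ((1.0, 0.0), (0.0, 1.0)):
        x = 0.0
        for _ in range(steps):
            k1 = deriv(x, y, dy)
            k2 = deriv(x + h / 2, y + h / 2 * k1[0], dy + h / 2 * k1[1])
            k3 = deriv(x + h / 2, y + h / 2 * k2[0], dy + h / 2 * k2[1])
            k4 = deriv(x + h, y + h * k3[0], dy + h * k3[1])
            y += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
            dy += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
            x += h
        result.append((y, dy))
    return result


def floquet_multipliers(a, b):
    """Both roots k of k^2 - (alpha11 + alpha22) k + det(alpha) = 0."""
    (a11, a12), (a21, a22) = monodromy(a, b)
    trace, det = a11 + a22, a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


k1, k2 = floquet_multipliers(1.0, 0.1)
assert abs(k1 * k2 - 1) < 1e-8  # the two values of k are reciprocal
```

The product of the two roots equals the determinant of the $\alpha$’s, which is unity because the equation contains no first-derivative term; this is the numerical counterpart of the pair $e^{\mu x}P(x)$, $e^{-\mu x}P(-x)$. The exponent follows from $\mu = (\ln k)/2\pi$.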
Problem. Show that the Schrödinger equation

$$\frac{d^2\psi}{dx^2} + [A + V(x)]\psi = 0,$$

in which A is a constant, and V is a periodic function of x such that $V(x + l) = V(x)$,
has solutions of the form

$$\psi = e^{\mu x}v(x),$$

where v is also periodic: $v(x + l) = v(x)$.
This is sometimes called Bloch’s theorem.¹⁸

¹⁷ See Seitz, F., “Modern Theory of Solids,” McGraw-Hill Book Co., New York,
1940, Chap. VIII.

¹⁸ Bloch, F., Z. Physik 52, 555 (1928).
2.18. Pfaff Differential Expressions and Equations.—The equations of
thermodynamics are peculiar inasmuch as they usually occur in the form

$$dW = \sum_{\lambda=1}^{n}X_\lambda\,dx_\lambda \tag{2-82}$$

where the $X_\lambda$ are functions of some or all the independent variables $x_\lambda$.
While (82), which is known as a Pfaff expression, is not a differential
equation of the customary kind, its importance in chemistry and physics requires
consideration. It is for lack of a more adequate place that this material is
inserted in the chapter on differential equations. Some of the material
which will be developed from a mathematical point of view in this section
has already been used in Chapter 1, to which reference should be made for
further applications. The equation

$$\sum_{\lambda=1}^{n}X_\lambda\,dx_\lambda = 0$$

is sometimes called a total differential equation or, more generally, a Pfaff
equation.

Clearly, the expression dW, eq. (82), can be integrated along any path
in n-dimensional space, but the integral will in general depend on the path
of integration. (See Prob. a, p. 87; also the example in sec. 1.8.)


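When $\int dW$ depends on the path of integration, dW is said to be incomplete
or inexact.

The condition that (82) be a complete differential is

$$dW = df(x_1, x_2, \cdots, x_n) \tag{2-83}$$

for then $\int_{x_1}^{x_2}dW = f(x_2) - f(x_1)$, independently of path. Now

$$df = \sum_\lambda\frac{\partial f}{\partial x_\lambda}\,dx_\lambda$$

Comparing with (82), we find

$$X_\lambda = \frac{\partial f}{\partial x_\lambda}$$

To state this relation without explicitly introducing the function f, we
differentiate it with respect to $x_\mu$, $\mu \neq \lambda$.

$$\frac{\partial X_\lambda}{\partial x_\mu} = \frac{\partial^2 f}{\partial x_\lambda\,\partial x_\mu}$$

But also

$$\frac{\partial X_\mu}{\partial x_\lambda} = \frac{\partial^2 f}{\partial x_\mu\,\partial x_\lambda}$$

Hence the necessary condition of “exactness” may be written in the form

$$\frac{\partial X_\lambda}{\partial x_\mu} = \frac{\partial X_\mu}{\partial x_\lambda};\quad \lambda, \mu = 1, 2, \cdots, n \tag{2-84}$$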


The reader who is already familiar with vector analysis will note that, if
the $X_\lambda$ are interpreted as components of a vector R, (82) may be written

$$dW = \mathbf{R}\cdot d\mathbf{r} \tag{2-82'}$$

and the condition of “exactness” becomes

$$\frac{\partial X}{\partial y} = \frac{\partial Y}{\partial x};\quad \frac{\partial X}{\partial z} = \frac{\partial Z}{\partial x};\quad \frac{\partial Y}{\partial z} = \frac{\partial Z}{\partial y}$$

or

$$\nabla\times\mathbf{R} = 0 \tag{2-84'}$$

These results are of importance in vector analysis where they are usually
expressed as follows: The condition that the line integral of R (expression
82’) around any closed curve shall vanish is that R be the gradient of some
scalar function, and this is equivalent to condition (84’). (Cf. sec. 4.17.)
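The dependence of $\int dW$ on the path can be seen in a small numerical experiment. The sketch below integrates a two-variable Pfaff expression $X\,dx + Y\,dy$ along two broken-line paths with common end points. The helper name `line_integral`, the two sample expressions and the sample paths are illustrative assumptions: $y\,dx + x\,dy = d(xy)$ satisfies the exactness condition, whereas $-y\,dx + x\,dy$ does not.

```python
def line_integral(X, Y, path, steps=2000):
    """Integrate X dx + Y dy along the straight segments joining the points
    of `path`, using the midpoint rule on every segment."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for i in range(steps):
            xm, ym = x0 + (i + 0.5) * dx, y0 + (i + 0.5) * dy
            total += X(xm, ym) * dx + Y(xm, ym) * dy
    return total


path1 = [(0, 0), (2, 0), (2, 1)]  # first along x, then along y
path2 = [(0, 0), (0, 1), (2, 1)]  # first along y, then along x

# Exact expression: both paths give f(2, 1) - f(0, 0) with f = xy.
exact = [line_integral(lambda x, y: y, lambda x, y: x, p) for p in (path1, path2)]
assert abs(exact[0] - exact[1]) < 1e-9

# Inexact expression: the results differ (here by twice the enclosed area).
inexact = [line_integral(lambda x, y: -y, lambda x, y: x, p) for p in (path1, path2)]
assert abs((inexact[0] - inexact[1]) - 4.0) < 1e-9
```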
We return now to the general situation: 


dW is not exact 


and distinguish two cases: 


A The equation dW = 0 has a solution. 
B The equation dW = 0 does not have a solution. 


A. The equation dW = 0 possesses a solution. Leaving aside for the
moment all considerations as to when such solutions may be found, we shall
first sketch the consequences of the existence of solutions. The equation
dW = 0 assigns to every point a direction, or, what amounts to the same
thing, an element of surface. (From the point of view of vector analysis
this is immediately clear because the relation $\mathbf{R}\cdot d\mathbf{r} = 0$ specifies at every point
$(x_1 \cdots x_n)$ the direction dr which is perpendicular to the vector R.)

When integrated, the equation dW = 0 leads to

$$\phi(x_1, x_2, \cdots, x_n) = c \tag{2-85}$$

which represents a one-parameter family of surfaces in n-dimensional
space. These surfaces consist of the elements specified by dW = 0.

We now wish to show that there exists an integrating denominator,
$t(x_1 \cdots x_n)$, such that dW/t is an exact differential. The proof is as follows.
Along the surface $\phi(x_1 \cdots x_n) = c$ (cf. Fig. 2-6), we have both $d\phi = 0$
and dW = 0. The same is true along a neighboring surface $\phi = c + dc$.
Suppose we wish to go from A to C. The change occurring in $\phi$ is dc,
no matter whether the crossing is made at $B_1$ or at $B_2$. But the change dW
will depend on the path. The important point to note is that no change
occurs in W as we pass along either curve; a change can occur only at the
crossing: dW = function of the point at which the crossing is made. (If
$dW \neq 0$ along the two curves, then it would depend on the whole path, not
merely on the point of crossing!) Hence $dW = t(B)\,d\phi$, where B is
the point of crossing. Hence $dW = t(x_1 \cdots x_n)\,d\phi$, or

$$d\phi = \frac{dW}{t}$$

[Fig. 2-6]

But $d\phi$ is an exact differential.

Along the curves $\phi$ = const., the equation $F(\phi)$ = const. will likewise
be satisfied if F represents a unique, single-valued function. If, then, we
use $F(\phi)$ in place of $\phi$ in the preceding analysis, we are led to

$$dF = \frac{dW}{T}\quad\text{instead of}\quad d\phi = \frac{dW}{t}$$

Since, however, $dF = (dF/d\phi)\,d\phi$, we see that $T = t/(dF/d\phi)$ is also an
integrating denominator. It is clear that, if there exists one integrating
denominator t for a Pfaff expression, an infinite number of others can be
formed by the above rule.

Only the points on the surface $\phi = c$ are connected with A by paths
along which dW = 0. It is clear that in the neighborhood of A there is an
infinite number of points not connected with A by such paths. Hence the
fact, important in thermodynamics (though somewhat trivial
geometrically!):

If the inexact differential dW possesses an integrating denominator t, then 
there exist, in the neighborhood of every point P, innumerable points which 
cannot be reached from P along paths for which dW = 0. 

We now consider the question of how to find the integrating denominator. 

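1. Case of two variables. First solve the equation

$$dW = 0;\quad X\,dx + Y\,dy = 0 \tag{2-86}$$

The solution is

$$y = f(x,c),\quad\text{or}\quad \phi(x,y) = c \tag{2-87}$$

Along the curves (87), $\phi_x\,dx + \phi_y\,dy = 0$, hence

$$\frac{dy}{dx} = -\frac{\phi_x}{\phi_y} \tag{2-88}$$

But from (86)

$$\frac{dy}{dx} = -\frac{X}{Y}$$

so that

$$\frac{\phi_x}{\phi_y} = \frac{X}{Y};\quad \frac{\phi_x}{X} = \frac{\phi_y}{Y} \equiv \mu(x,y) \tag{2-89}$$

Now

$$d\phi = \frac{dW}{t} = \phi_x\,dx + \phi_y\,dy = \mu X\,dx + \mu Y\,dy = \mu\,dW$$

Hence

$$t = \frac{1}{\mu} = \frac{X}{\phi_x} = \frac{Y}{\phi_y} \tag{2-90}$$

by (89).

2. Case of three variables. First solve

$$dW = 0;\quad X\,dx + Y\,dy + Z\,dz = 0 \tag{2-91}$$

The solution is

$$\phi(x,y,z) = c$$

Along these surfaces, $\phi_x\,dx + \phi_y\,dy + \phi_z\,dz = 0$, hence

$$\left(\frac{\partial y}{\partial x}\right)_z = -\frac{\phi_x}{\phi_y};\quad \left(\frac{\partial z}{\partial x}\right)_y = -\frac{\phi_x}{\phi_z};\quad \left(\frac{\partial z}{\partial y}\right)_x = -\frac{\phi_y}{\phi_z}$$

But from (91)

$$\left(\frac{\partial y}{\partial x}\right)_z = -\frac{X}{Y};\quad \left(\frac{\partial z}{\partial x}\right)_y = -\frac{X}{Z};\quad \left(\frac{\partial z}{\partial y}\right)_x = -\frac{Y}{Z}$$

Hence

$$\frac{\phi_x}{\phi_y} = \frac{X}{Y};\quad \frac{\phi_x}{\phi_z} = \frac{X}{Z};\quad \frac{\phi_y}{\phi_z} = \frac{Y}{Z}$$

or

$$\frac{\phi_x}{X} = \frac{\phi_y}{Y} = \frac{\phi_z}{Z} \equiv \mu(x,y,z)$$

Now

$$d\phi = \frac{dW}{t} = \phi_x\,dx + \phi_y\,dy + \phi_z\,dz = \mu(X\,dx + Y\,dy + Z\,dz)$$

Therefore

$$t = \frac{1}{\mu}$$

Similarly for more than three variables.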
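We now consider the condition that the equation

$$dW = 0$$

shall have a solution. (Condition of integrability.)
Suppose a solution of $\sum_\lambda X_\lambda\,dx_\lambda = 0$ exists in the form

$$\phi(x_1 \cdots x_n) = c$$

Then

$$\mu(x_1 \cdots x_n)\,X_i = \frac{\partial\phi}{\partial x_i};\quad i = 1, 2, \cdots, n \tag{2-92}$$

Let i, j, k be different indices. It follows from (92) that

$$\frac{\partial}{\partial x_j}(\mu X_i) = \frac{\partial^2\phi}{\partial x_i\,\partial x_j} = \frac{\partial}{\partial x_i}(\mu X_j)$$

whence

$$\mu\left(\frac{\partial X_i}{\partial x_j} - \frac{\partial X_j}{\partial x_i}\right) = X_j\frac{\partial\mu}{\partial x_i} - X_i\frac{\partial\mu}{\partial x_j}$$

Similarly,

$$\mu\left(\frac{\partial X_j}{\partial x_k} - \frac{\partial X_k}{\partial x_j}\right) = X_k\frac{\partial\mu}{\partial x_j} - X_j\frac{\partial\mu}{\partial x_k}$$

$$\mu\left(\frac{\partial X_k}{\partial x_i} - \frac{\partial X_i}{\partial x_k}\right) = X_i\frac{\partial\mu}{\partial x_k} - X_k\frac{\partial\mu}{\partial x_i}$$

Multiply the last three equations by $X_k$, $X_i$, and $X_j$, respectively, and add:

$$X_k\left(\frac{\partial X_i}{\partial x_j} - \frac{\partial X_j}{\partial x_i}\right) + X_i\left(\frac{\partial X_j}{\partial x_k} - \frac{\partial X_k}{\partial x_j}\right) + X_j\left(\frac{\partial X_k}{\partial x_i} - \frac{\partial X_i}{\partial x_k}\right) = 0 \tag{2-93}$$

By closer analysis, this equation may be shown to be both necessary and
sufficient: it represents the condition of integrability for the Pfaff equation
dW = 0. In three variables, eq. (93) takes the form

$$\mathbf{R}\cdot\nabla\times\mathbf{R} = 0$$

provided R is interpreted as the vector having components $X_1$, $X_2$, $X_3$.
The total number of equations of the form (93) is equal to the number of
triangles that can be formed with n given points as corners; it is therefore
$\tfrac{1}{6}n(n-1)(n-2)$. These equations are therefore not independent.

It is to be observed that, in the case of two variables, eq. (93) is always
satisfied. Hence every Pfaff equation of the form

$$X\,dx + Y\,dy = 0$$

possesses a solution.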
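For three variables the integrability condition $\mathbf{R}\cdot\nabla\times\mathbf{R} = 0$ is easy to test numerically. The following sketch uses central differences; the function name `r_dot_curl_r`, the step size and the two sample vector fields are illustrative assumptions. The first field, (y, −x, 0), belongs to the inexact but integrable expression $y\,dx - x\,dy$; the second, (−y, x, k) with constant k, gives the non-vanishing value 2k and therefore admits no integrating denominator.

```python
def r_dot_curl_r(R, p, h=1e-5):
    """Central-difference estimate of R . (curl R) at the point p = (x, y, z)."""
    def d(i, j):
        # partial derivative of component i of R with respect to coordinate j
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (R(*plus)[i] - R(*minus)[i]) / (2 * h)

    curl = (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return sum(c * r for c, r in zip(curl, R(*p)))


point = (0.3, -1.2, 0.7)
# Integrable: y dx - x dy = 0 has the solutions y/x = const.
assert abs(r_dot_curl_r(lambda x, y, z: (y, -x, 0.0), point)) < 1e-6
# Not integrable: R . curl R = 2k (here k = 1.5).
assert abs(r_dot_curl_r(lambda x, y, z: (-y, x, 1.5), point) - 3.0) < 1e-6
```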
B. The equation dW = 0 does not possess a proper solution, i.e., eq.
(93) is not satisfied. For simplicity, we consider only the case of three 
variables, where the solutions can be visualized easily in ordinary space. 
Generalization to more variables introduces no complications. It will be 
seen that “improper ” solutions of eq. (82) are still possible, but that they 
represent a greater variety of functions than the proper solutions considered 
in the preceding paragraphs. 

We now choose an arbitrary relation

$$\psi(x,y,z) = 0 \tag{2-94}$$

and impose this upon eq. (82), thereby effectively eliminating one degree of
freedom. From (94) and its differential form

$$\psi_x\,dx + \psi_y\,dy + \psi_z\,dz = 0$$

the variables z and dz are obtained in terms of x, y, dx, dy, and these
solutions are substituted in eq. (82). It will then be of the form

$$\bar{X}\,dx + \bar{Y}\,dy = 0$$

(the coefficients $\bar{X}$ and $\bar{Y}$ now being functions of x and y alone), and this has a solution

$$\phi(x,y) = 0 \tag{2-95}$$


The improper solutions of (82) are said to be those curves which satisfy (94) 
and (95) simultaneously. They represent, therefore, prescribed curves 
upon arbitrary surfaces. Further investigation would show that every 
point in the neighborhood of a given point can be reached by a continuous 
curve satisfying (94) and (95) from the given point, the state of affairs being 
quite different from that described under A. 


Problem a. Let $dW = x(dx + dy)$. Compute the integral $\int_{x_1y_1}^{x_2y_2}dW$ along two
paths:

1. $x_1y_1 \to x_2y_1 \to x_2y_2$.
2. $x_1y_1 \to x_1y_2 \to x_2y_2$.

Show that the two results differ by the area enclosed by the two paths of integration.

Problem b. Show that the expression

$$dW = -y\,dx + x\,dy + k\,dz = 0$$

where k is a constant, does not possess an integral.¹⁹

¹⁹ See Born, M., Physik. Z. 22, 250 (1921).
CHAPTER 3
SPECIAL FUNCTIONS 


3.1. Elements of Complex Integration. Theorems of Cauchy.—Some 
acquaintance with the calculus of complex variables facilitates this work; 
hence the present section and the next will outline the elements of this 
useful subject. 

As to notation, $i^2 = -1$, and the symbols x and y are used for single
real variables. Furthermore,

$$z = x + iy = \rho e^{i\theta}.$$

Through the Argand diagram, which consists of a real axis along x, an
imaginary axis along y, and presents z as the point with rectangular
coordinates x and y, this last relation is at once made clear: $\rho^2 = x^2 + y^2$,
$\theta = \tan^{-1}(y/x)$.

Now let f(z) be a single-valued function and analytic in the sense that
it has a unique derivative with respect to both x and y at every point of the
Argand plane. We may then write, in terms of two new analytic functions
u and v,

$$f(z) = u(x,y) + iv(x,y)$$

Hence follows an important result. Since

$$\frac{\partial f}{\partial x} = \frac{df}{dz} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}$$

and

$$\frac{\partial f}{\partial y} = i\frac{df}{dz} = \frac{\partial u}{\partial y} + i\frac{\partial v}{\partial y}$$

one finds on equating real and imaginary parts of df/dz that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y};\quad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}$$

These are the famous Cauchy-Riemann conditions which the components
u and v of an analytic f(z) must satisfy. Further differentiation yields
another important set of relations, which we shall often use hereafter:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0;\quad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0$$

Cauchy’s theorem asserts that the integral

$$\oint_C f(z)\,dz = 0$$

provided it is taken along a closed curve C on and within which f(z) is
analytic. A simple proof is as follows.

$$\oint_C f(z)\,dz = \oint_C[u(dx + i\,dy) + iv(dx + i\,dy)] = \oint_C(u\,dx - v\,dy) + i\oint_C(v\,dx + u\,dy)$$

By virtue of the Cauchy-Riemann conditions, both of the final integrands
are exact in the sense of sec. 1.7, and the line integral around the closed
contour vanishes.

Equally important is an extension of Cauchy’s theorem to which we
now turn. Suppose again that f(z) is analytic within and on a closed curve
C in the Argand plane, and denote by $z_0$ a fixed point within C. The
function $f(z)/(z - z_0)$ will then have a singularity at $z_0$ and its line integral
along C will not be zero. But the value of this integral will remain
unchanged if we alter C, so long as the contour does not cross $z_0$. This follows
at once from Cauchy’s theorem, for the difference between the old and the
new value of the integral will itself be a line integral around a region over
which $f(z)/(z - z_0)$ is analytic, and will therefore vanish. Let us denote
the infinitesimally small circle of radius $\rho$ surrounding the point $z_0$ by $\Gamma$. Then

$$\oint_C\frac{f(z)}{z - z_0}\,dz = \oint_\Gamma\frac{f(z)}{z - z_0}\,dz = f(z_0)\oint_\Gamma\frac{dz}{z - z_0} = f(z_0)\int_0^{2\pi}\frac{d(\rho e^{i\theta})}{\rho e^{i\theta}} = f(z_0)\,i\int_0^{2\pi}d\theta = 2\pi i f(z_0)$$

provided we traverse C in counter-clockwise fashion. Hence the theorem

$$\frac{1}{2\pi i}\oint_C\frac{f(z)}{z - z_0}\,dz = f(z_0) \tag{3-1}$$

Henceforth we shall understand that $\oint$ denotes counter-clockwise
(so-called positive) integration along a closed curve C.
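Both Cauchy’s theorem and its extension can be illustrated by a direct numerical contour integration. The sketch below is only an illustration: the function name `contour_integral`, the choice of a circular contour, the test function $e^z$ and the interior point are assumptions made for the example. The circle is parametrized by its angle and the integral is approximated with the trapezoidal rule.

```python
import cmath
import math


def contour_integral(f, center, radius, n=2000):
    """Approximate the counter-clockwise integral of f(z) dz around the circle
    |z - center| = radius, using the trapezoidal rule in the angle theta."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        z = center + radius * cmath.exp(1j * theta)
        dz = 1j * radius * cmath.exp(1j * theta) * (2 * math.pi / n)
        total += f(z) * dz
    return total


z0 = 0.3 + 0.2j  # a point inside the unit circle

# Cauchy's theorem: an analytic integrand gives zero.
assert abs(contour_integral(cmath.exp, 0, 1.0)) < 1e-10

# The extension: (1 / 2 pi i) times the integral of f(z)/(z - z0) returns f(z0).
value = contour_integral(lambda z: cmath.exp(z) / (z - z0), 0, 1.0) / (2j * math.pi)
assert abs(value - cmath.exp(z0)) < 1e-10
```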
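3.1a. Theorem of Laurent. Residues.—Let f(z) be analytic on the ring
formed by two concentric circles $C_1$ and $C_2$ including these boundaries. (See
Fig. 1.) Apply eq. (1) to the point $z_0 = \zeta$, choosing a contour which goes
from A along $C_2$ to B, thence inside to $C_1$, around $C_1$ in a negative sense,
and finally back to A. The two horizontal portions of the path make equal
and opposite contributions to the integral and therefore cancel. Hence

$$f(\zeta) = \frac{1}{2\pi i}\oint_{C_2}\frac{f(z)}{z - \zeta}\,dz + \frac{1}{2\pi i}\oint_{C_1}\frac{f(z)}{z - \zeta}\,dz \tag{a}$$

the second integral being taken around $C_1$ in the negative sense.

[Fig. 3-1]

In the first integral we may write ($z_0$ now denotes the common center
of $C_1$ and $C_2$)

$$\frac{1}{z - \zeta} = \frac{1}{z - z_0}\cdot\frac{1}{1 - \dfrac{\zeta - z_0}{z - z_0}} = \frac{1}{z - z_0}\sum_{\lambda=0}^{\infty}\left(\frac{\zeta - z_0}{z - z_0}\right)^\lambda$$

and obtain a series which converges on $C_2$. Therefore

$$\frac{1}{2\pi i}\oint_{C_2}\frac{f(z)}{z - \zeta}\,dz = \sum_{\lambda=0}^{\infty}a_\lambda(\zeta - z_0)^\lambda \tag{b}$$

provided we define the coefficients

$$a_\lambda = \frac{1}{2\pi i}\oint_{C_2}\frac{f(z)\,dz}{(z - z_0)^{\lambda+1}} \tag{c}$$

In the second integral of (a) we use the convergent expansion

$$\frac{1}{z - \zeta} = -\frac{1}{\zeta - z_0}\cdot\frac{1}{1 - \dfrac{z - z_0}{\zeta - z_0}} = -\frac{1}{\zeta - z_0}\sum_{\lambda=0}^{\infty}\left(\frac{z - z_0}{\zeta - z_0}\right)^\lambda$$

so that

$$\frac{1}{2\pi i}\oint_{C_1}\frac{f(z)}{z - \zeta}\,dz = -\sum_{\lambda=0}^{\infty}b_\lambda(\zeta - z_0)^{-\lambda-1} \tag{d}$$

provided

$$b_\lambda = \frac{1}{2\pi i}\oint_{C_1}f(z)(z - z_0)^\lambda\,dz. \tag{e}$$

Now put $\lambda = -\mu - 1$. Then $b_\lambda$ becomes $a_\mu$, except that the integration
in $b_\lambda$ is along $C_1$, in $a_\mu$ along $C_2$. But the integrands of (c) and (e) have
no singularities within the ring, hence the path of integration may be taken
the same in both of these expressions. We may indeed take it to be any
curve, C, within the ring, which encloses the point $\zeta$. Thus, because
the sense of $C_2$ is opposite to that of $C_1$, $b_\lambda = -a_\mu$.
Eq. (d) now reads

$$\frac{1}{2\pi i}\oint_{C_1}\frac{f(z)}{z - \zeta}\,dz = \sum_{\mu=-\infty}^{-1}a_\mu(\zeta - z_0)^\mu \tag{f}$$

When we add (b) and (f) to form (a), and replace $\zeta$ by z in the final formula,
we find

$$f(z) = \sum_{\lambda=-\infty}^{\infty}a_\lambda(z - z_0)^\lambda \tag{3-2a}$$

with

$$a_\lambda = \frac{1}{2\pi i}\oint_C\frac{f(z)\,dz}{(z - z_0)^{\lambda+1}} \tag{3-2b}$$

Eq. (2) is called Laurent’s theorem. It shows that a function f(z), free from
singularities on a circular ring, can be expanded as a Laurent series (eq. 2a),
which contains negative as well as positive powers of $z - z_0$.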


The term $a_{-1}$, formed by means of eq. (2b), is especially interesting.
It is

$$a_{-1} = \frac{1}{2\pi i}\oint_C f(z)\,dz \tag{3-3}$$

As to the contour C, we noted that it includes the point $z_0$. If f(z) is
analytic at $z_0$, $a_{-1} = 0$. The theorem is useful, therefore, when $z_0$ is a
singular point of f(z). The constant $a_{-1}$ is called the residue of the function f(z) at
$z_0$, and eq. (3) is known as the theorem of residues.

If f(z) has a number of singularities $z_0$, $z_1$, $z_2$, etc., within a contour
C, this path may be distorted in the manner shown in Fig. 2 to form C′.
Since the triangular path, exclusive of the circles, lies in a region free from
singularities, it contributes nothing to $\oint f\,dz$. The remainder comes from
the singularities and is equal to the sum of their residues. Hence

$$\oint f(z)\,dz = 2\pi i\ (\text{sum of residues within } C).$$

[Fig. 3-2]


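Example. To evaluate the integral

$$I = \int_{-\pi}^{\pi}\frac{d\phi}{a + b\cos\phi + c\sin\phi}$$

Let $z = e^{i\phi}$; $\phi = -i\log z$, $d\phi = -i(dz/z)$. Then $\cos\phi = \tfrac{1}{2}(z + z^{-1})$,
$\sin\phi = (1/2i)(z - z^{-1})$.

$$I = -i\oint\frac{dz}{az + \dfrac{b}{2}(z^2 + 1) + \dfrac{c}{2i}(z^2 - 1)}$$

the contour being the unit circle about 0. The denominator of the integrand may be
written

$$\tfrac{1}{2}(b - ic)z^2 + az + \tfrac{1}{2}(b + ic) = \tfrac{1}{2}(b - ic)\left[z - \frac{-a + R}{b - ic}\right]\left[z - \frac{-a - R}{b - ic}\right]$$

provided we put

$$R = \sqrt{a^2 - b^2 - c^2}$$

If $a^2 - b^2 - c^2 > 0$ then

$$\left|\frac{-a + R}{b - ic}\right| < 1$$

The other root has modulus > 1 and lies outside the unit circle. The residue of the
integrand at $z = (-a + R)/(b - ic)$ is

$$\left\{\tfrac{1}{2}(b - ic)\left[\frac{-a + R}{b - ic} - \frac{-a - R}{b - ic}\right]\right\}^{-1} = \frac{1}{R}$$

Therefore

$$I = -i\cdot 2\pi i\cdot\frac{1}{R} = 2\pi(a^2 - b^2 - c^2)^{-1/2} = \frac{2\pi}{\sqrt{a^2 - b^2 - c^2}}$$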

3.2. Gamma Function.—The gamma function is a generalization of
the factorial n! for non-integral values of n; more specifically, $\Gamma(z)$ is so
chosen that, if n is an integer, $\Gamma(n) = (n - 1)!$ A fundamental
definition, due to Euler, states

$$\Gamma(z) = \lim_{n\to\infty}\frac{1\cdot2\cdot3\cdots(n-1)}{z(z+1)(z+2)\cdots(z+n-1)}\,n^z \tag{3-4}$$

Several important properties of the $\Gamma$-function follow at once from this
definition. Since from (4)

$$\Gamma(z+1) = \lim_{n\to\infty}\frac{1\cdot2\cdots(n-1)}{(z+1)(z+2)\cdots(z+n)}\,n^{z+1} = \lim_{n\to\infty}\frac{nz}{z+n}\cdot\frac{1\cdot2\cdots(n-1)}{z(z+1)\cdots(z+n-1)}\,n^z$$

we have

$$\Gamma(z+1) = z\Gamma(z) \tag{3-5}$$

On the other hand, (4) also shows that

$$\Gamma(1) = \lim_{n\to\infty}\frac{n!}{n!} = 1 \tag{3-6}$$

From (5) and (6) it is at once apparent that, if n is a positive integer,

$$\Gamma(n) = (n - 1)! \tag{3-7}$$

as was stated above. It is also evident from the definition (4) that $\Gamma(z)$
becomes infinite at z = 0, −1, −2, etc., and that it is an analytic function
everywhere else.

It is often useful to represent $\Gamma(z)$ by means of a definite integral. To
achieve this, we consider the function

$$F(z,n) = \int_0^n\left(1 - \frac{t}{n}\right)^n t^{z-1}\,dt \tag{3-8}$$

wherein n stands for a positive integer, and the real part of z is taken to be
greater than zero in order to insure convergence of the integral. The
transformation $\tau = t/n$ converts F into

$$F(z,n) = n^z\int_0^1(1 - \tau)^n\tau^{z-1}\,d\tau$$

The integral appearing here may be evaluated by repeated partial
integrations:

$$\int_0^1(1 - \tau)^n\tau^{z-1}\,d\tau = \left[(1 - \tau)^n\frac{\tau^z}{z}\right]_0^1 + \frac{n}{z}\int_0^1(1 - \tau)^{n-1}\tau^z\,d\tau$$

The integrated part here vanishes at both limits, and the remainder may
again be subjected to a partial integration, yielding

$$\frac{n}{z}\left[(1 - \tau)^{n-1}\frac{\tau^{z+1}}{z+1}\right]_0^1 + \frac{n(n-1)}{z(z+1)}\int_0^1(1 - \tau)^{n-2}\tau^{z+1}\,d\tau$$

The integrated part is again zero. By continuing this process we find

$$F(z,n) = n^z\,\frac{n(n-1)\cdots1}{z(z+1)\cdots(z+n-1)}\int_0^1\tau^{z+n-1}\,d\tau = \frac{1\cdot2\cdots n}{z(z+1)\cdots(z+n)}\,n^z$$

As n approaches infinity, this expression becomes identical with (4); hence

$$\lim_{n\to\infty}F(z,n) = \Gamma(z) \tag{3-9}$$

On the other hand, since $e = \lim_{p\to\infty}(1 + 1/p)^p$ and therefore

$$e^x = \lim_{p\to\infty}(1 + 1/p)^{px} = \lim_{n\to\infty}(1 + x/n)^n$$

the quantity $(1 - t/n)^n$ appearing in (8) approaches the limit $e^{-t}$. We
conclude, therefore, that in view of (8) and (9)

$$\int_0^\infty e^{-t}t^{z-1}\,dt = \Gamma(z) \tag{3-10}$$

This result is valid, we recall, when the real part of z is greater than zero.
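How quickly Euler’s limiting expression approaches $\Gamma(z)$ can be gauged numerically. In the sketch below (the function name `gamma_euler` and the value of n are illustrative assumptions) the finite-n expression $1\cdot2\cdots(n-1)\,n^z / [z(z+1)\cdots(z+n-1)]$ is accumulated in logarithms for real z > 0 and compared with the standard-library gamma function; the agreement improves only in proportion to 1/n, so modest tolerances are used.

```python
import math


def gamma_euler(z, n=100000):
    """Finite-n value of Euler's expression for real z > 0, computed as
    exp(z ln n - ln z + sum_{m=1}^{n-1} ln(m / (z + m)))."""
    log_value = z * math.log(n) - math.log(z)
    for m in range(1, n):
        log_value += math.log(m / (z + m))
    return math.exp(log_value)


for z in (0.5, 1.0, 2.5, 4.0):
    assert abs(gamma_euler(z) / math.gamma(z) - 1) < 1e-4

# The functional equation Gamma(z + 1) = z Gamma(z) holds in the limit as well.
assert abs(gamma_euler(3.5) / (2.5 * gamma_euler(2.5)) - 1) < 1e-4
```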

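A definition of the $\Gamma$-function, or rather its reciprocal, by means of an
infinite product has been given by Weierstrass. Since it is a useful one, we
shall here derive it by simple steps (the rigor of which is not always obvious)
from Euler’s definition (4). We first note that the product

$$\frac{1\cdot2\cdots(n-1)}{z(z+1)\cdots(z+n-1)}$$

which appears in (4), may be written $\dfrac{1}{z}\prod_{m=1}^{n-1}(1 + z/m)^{-1}$, so that (4)
becomes

$$\Gamma(z) = \frac{1}{z}\lim_{n\to\infty}n^z\prod_{m=1}^{n-1}\left(1 + \frac{z}{m}\right)^{-1}$$

or, since the additional factor $(1 + z/n)$ tends to unity,

$$\frac{1}{\Gamma(z)} = z\lim_{n\to\infty}n^{-z}\prod_{m=1}^{n}\left(1 + \frac{z}{m}\right)$$

If we multiply the right-hand side of this equation by unity in the form of

$$\left[\lim_{n\to\infty}e^{(1 + \frac{1}{2} + \cdots + \frac{1}{n})z}\right]\left[\lim_{n\to\infty}\prod_{m=1}^{n}e^{-z/m}\right]$$

we obtain

$$\frac{1}{\Gamma(z)} = z\left[\lim_{n\to\infty}e^{(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n)z}\right]\lim_{n\to\infty}\prod_{m=1}^{n}\left(1 + \frac{z}{m}\right)e^{-z/m}$$

Now the infinite series: $\lim_{n\to\infty}(1 + \tfrac{1}{2} + \cdots + 1/n - \log n) = C$ converges;
it has the value C = 0.5772···, known as the Euler-Mascheroni constant.
Hence

$$\frac{1}{\Gamma(z)} = ze^{Cz}\prod_{m=1}^{\infty}\left(1 + \frac{z}{m}\right)e^{-z/m} \tag{3-11}$$

which is the Weierstrass definition. It shows, again, that $\Gamma(z)$ has poles
at z = 0, −1, −2, etc.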

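A further important property of Γ-functions, namely the relation

$$\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin \pi z} \tag{3-12}$$

is readily derived from the Weierstrass definition. First, we recall the
theorem:

$$\frac{\sin \pi z}{\pi z} = \prod_{n=1}^{\infty}\left(1 - \frac{z^2}{n^2}\right) \tag{3-13}$$

which may be proved by an expansion of the infinite product as a sum of
powers of $z^2$. (The details are left as an exercise for the reader.) From
(11),

$$\Gamma(z)\,\Gamma(-z) = -\frac{1}{z^2}\prod_{n=1}^{\infty}\left(1+\frac{z}{n}\right)^{-1}\left(1-\frac{z}{n}\right)^{-1} = -\frac{1}{z^2}\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2}\right)^{-1} = -\frac{\pi}{z\sin \pi z} \tag{3-14}$$

the last step because of (13). But in view of (5)

$$\Gamma(-z) = -\frac{1}{z}\,\Gamma(1-z)$$

and this, when inserted in (14), yields (12).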
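The Weierstrass product (3-11) and the reflection relation (3-12) can be spot-checked numerically. The following is only a sketch: the function name `inv_gamma_weierstrass`, the truncation length and the test value z = 0.3 are illustrative choices, and Python's `math.gamma` is taken as the reference value.

```python
import math

EULER_C = 0.5772156649015329  # Euler-Mascheroni constant C


def inv_gamma_weierstrass(z, terms=200000):
    """Approximate 1/Gamma(z) by truncating the product (3-11) after `terms` factors."""
    log_prod = 0.0
    for m in range(1, terms + 1):
        # log of the factor (1 + z/m) * exp(-z/m); summing logs avoids overflow
        log_prod += math.log1p(z / m) - z / m
    return z * math.exp(EULER_C * z + log_prod)


z = 0.3
approx = 1.0 / inv_gamma_weierstrass(z)   # truncated Weierstrass product
exact = math.gamma(z)                     # library value of Gamma(z)

reflection_lhs = math.gamma(z) * math.gamma(1.0 - z)
reflection_rhs = math.pi / math.sin(math.pi * z)

print(approx, exact)
print(reflection_lhs, reflection_rhs)
```

The truncated product converges slowly (the neglected factors contribute a relative error of roughly z²/2N for N factors), which is why a large number of terms is used here.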
Several other formulas, for the derivation of which the reader should
refer to mathematical treatises,¹ will now be listed without proof.

$$\Gamma(z)\,\Gamma(z+\tfrac12) = 2^{1-2z}\,\pi^{1/2}\,\Gamma(2z) \tag{3-15}$$

An infinite product of the form

$$\frac{1-a}{1-b}\cdot\frac{2-a}{2-b}\cdot\frac{3-a}{3-b}\cdots$$

may be expressed in terms of Γ-functions:

$$\prod_{n=1}^{\infty}\frac{n-a}{n-b} = \frac{\Gamma(1-b)}{\Gamma(1-a)}$$

Also,

$$\prod_{n=1}^{\infty}\frac{n(a+b+n)}{(a+n)(b+n)} = \frac{\Gamma(a+1)\,\Gamma(b+1)}{\Gamma(a+b+1)} \tag{3-16a}$$

If m and n are positive constants, not necessarily integral, we have

$$2\int_0^{\pi/2}\cos^{m-1}x\,\sin^{n-1}x\,dx = \frac{\Gamma\!\left(\frac{m}{2}\right)\Gamma\!\left(\frac{n}{2}\right)}{\Gamma\!\left(\frac{m+n}{2}\right)} \tag{3-17}$$

This relation may be modified as follows. Put m = 2r, n = 2s, and
introduce the new variable of integration $\cos^2 x = u$ on the left. The
integral will then be converted into

$$\int_0^1 u^{r-1}(1-u)^{s-1}\,du$$

which is a function of r and s known as the Eulerian integral of the first kind,
or simply the B-function, and denoted by B(r,s). Eq. (17) may therefore
be put in the form

$$B(r,s) = \frac{\Gamma(r)\,\Gamma(s)}{\Gamma(r+s)} \tag{3-17′}$$

¹ See, for instance, Whittaker and Watson, p. 235.
The logarithmic derivative of the Γ-function is given by

$$\frac{d}{dz}\ln\Gamma(z) = \int_0^{\infty}\left(\frac{e^{-t}}{t} - \frac{e^{-zt}}{1-e^{-t}}\right)dt \tag{3-18}$$

if x, the real part of z, is greater than zero, as was shown by Gauss.
From this result it is possible to obtain an expression for ln Γ(z) which
is useful in evaluating Γ(z) for large values of z:

$$\ln\Gamma(z) = \left(z-\tfrac12\right)\ln z - z + \tfrac12\ln(2\pi) + O\!\left(\frac1z\right) \tag{3-19}$$

where O(1/z) represents a series of terms which vanish for large z at least as
strongly as 1/z. For real z = x, (19) takes the form of Stirling's series, when
written for Γ instead of its logarithm:

$$\Gamma(x) = e^{-x}\,x^{x-1/2}\,(2\pi)^{1/2}\left[1 + \frac{1}{12x} + \frac{1}{288x^2} - \frac{139}{51840x^3} - \frac{571}{2488320x^4} + \cdots\right] \tag{3-20}$$

It is valid when x is large. This expansion may be used for the approximate
evaluation of factorials of large numbers:

$$N! = N\,\Gamma(N) = \left(\frac{N}{e}\right)^{N}(2\pi N)^{1/2}\left(1 + \frac{1}{12N} + \cdots\right) \tag{3-21}$$
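To see how quickly Stirling's series becomes accurate, one can compare (3-21) with exact factorials. The sketch below is illustrative only: `stirling_factorial` and its `terms` parameter are names chosen here, and the coefficients are those of the series (3-20).

```python
import math


def stirling_factorial(n, terms=2):
    """Approximate n! by (3-21), keeping `terms` correction terms of the series (3-20)."""
    coeffs = [1.0 / 12.0, 1.0 / 288.0, -139.0 / 51840.0, -571.0 / 2488320.0]
    series = 1.0 + sum(c / n ** (k + 1) for k, c in enumerate(coeffs[:terms]))
    return (n / math.e) ** n * math.sqrt(2.0 * math.pi * n) * series


for n in (5, 10, 20):
    exact = math.factorial(n)
    # ratio of approximation to exact value, without and with two correction terms
    print(n, stirling_factorial(n, 0) / exact, stirling_factorial(n, 2) / exact)
```

Even for N = 10 the leading factor alone is within about one per cent of N!, and two correction terms reduce the relative error to a few parts in a million.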
In concluding, let us compute a few numerical values of the Γ-function.
It has already been noted that

$$\Gamma(0) = \infty,\quad \Gamma(1) = 1,\quad \Gamma(2) = 1,\quad \Gamma(3) = 2!,\quad \Gamma(4) = 3!,\ \text{etc.}$$

If the values of Γ(x) in the interval 0 < x < 1 are known, Γ(x) can be
computed for all real positive x by means of (5). Γ(½) is easily obtained
from eq. (10):

$$\Gamma(\tfrac12) = \int_0^{\infty} e^{-t}\,t^{-1/2}\,dt = 2\int_0^{\infty} e^{-x^2}\,dx$$

if $x^2$ is written for t. Hence $\Gamma(\tfrac12) = \sqrt{\pi}$. The same result could have
been obtained by putting z = ½ in (12). Thus we find: $\Gamma(\tfrac12) = \pi^{1/2}$,
$\Gamma(\tfrac32) = \tfrac12\pi^{1/2}$, $\Gamma(\tfrac52) = \tfrac34\pi^{1/2}$, $\Gamma(\tfrac72) = \tfrac{15}{8}\pi^{1/2}$, etc.² The qualitative behavior
of Γ(x) is plotted in Fig. 3.


Problems. 


a. Prove eq. (13) by expanding the infinite product. 

b. Prove eq. (17′) directly. Hint: Express Γ(r)Γ(s) as a double integral in
accordance with eq. (10). Next, put the two variables of integration, respectively, equal
to x² and y² and then transform to polar coordinates. The radial integral will be
Γ(r + s), the remainder B(r,s).

c. Show that B(r,s) = B(r + 1,s) + B(r,s + 1).


3.3. Legendre Polynomials—Of the solutions of Legendre’s equation 
(2.28), the functions denoted by $P_l(x)$ in sec. 2.11 are of greatest interest
because they remain finite at x = ±1. In physical problems, the argument
of $P_l$ is usually the cosine of an angle and has therefore the range
−1 ≤ x ≤ 1. $P_l$ is definite at the endpoints of that range; the other
solutions are not. Hence the present discussion will be restricted to the
polynomials $P_l$. We repeat their definition:

$$P_l(x) = \frac{(2l)!}{2^l\,(l!)^2}\left\{x^l - \frac{l(l-1)}{2(2l-1)}\,x^{l-2} + \frac{l(l-1)(l-2)(l-3)}{2\cdot 4\,(2l-1)(2l-3)}\,x^{l-4} - \cdots\right\} \tag{3-22}$$

Specifically,

$$P_0 = 1,\quad P_1 = x,\quad P_2 = \tfrac12(3x^2-1),\quad P_3 = \tfrac12(5x^3-3x),\quad P_4 = \tfrac18(35x^4-30x^2+3),\ \text{etc.}$$


² Tabulated values of the Γ-function for real arguments may be found in Jahnke, E.,
and Emde, F., "Tables of Functions," Dover Publications, New York, 1943.

[Fig. 3-3: qualitative behavior of Γ(x)]

An interesting representation of $P_l$ is easily established. When the
function

$$F(x,y) = (1 - 2xy + y^2)^{-1/2} \tag{3-23}$$

is differentiated l times with respect to y and y is then put equal to
zero, the result is seen to be $l!\,P_l(x)$. Hence if F(x,y) is expanded in a
MacLaurin series about the value (x,0) the result

$$F(x,y) = F(x,0) + y\left[\frac{\partial F(x,y)}{\partial y}\right]_{y=0} + \frac{y^2}{2!}\left[\frac{\partial^2 F(x,y)}{\partial y^2}\right]_{y=0} + \cdots + \frac{y^l}{l!}\left[\frac{\partial^l F(x,y)}{\partial y^l}\right]_{y=0} + \cdots$$

becomes

$$F(x,y) = (1 - 2xy + y^2)^{-1/2} = \sum_{l=0}^{\infty} P_l(x)\,y^l \tag{3-24}$$


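This relation has meaning only when the right-hand side converges.
Suppose that |x| < 1 which, as pointed out, is the case in most applications.
$P_l(x)$ will then also lie between 1 and −1, for the definition of $P_l$
shows that every $P_l(1) = 1$ and that $|P_l(x)| < 1$ for |x| < 1. Thus the
coefficients of $y^l$ in (24) are never greater, in absolute value, than
1, and the series converges when |y| < 1.

Theorem (24) is of interest in the calculation of the potential due to a
static distribution of electrical charges. In terms of Fig. 4, which depicts
a distribution of charges $q_1 \cdots q_4$ of different magnitudes and
possibly of different signs, the potential at P is

[Fig. 3-4: charges $q_1 \cdots q_4$ at distances $r_i$ from the origin; the field point P lies at distance R, and $\theta_i$ is the angle between $r_i$ and R]

$$V = \sum_i \frac{q_i}{\left(R^2 + r_i^2 - 2r_iR\cos\theta_i\right)^{1/2}} = \frac{1}{R}\sum_i q_i\left[1 - 2\,\frac{r_i}{R}\cos\theta_i + \left(\frac{r_i}{R}\right)^2\right]^{-1/2}$$

On identifying $\cos\theta_i$ with x and $(r_i/R)$ with y in (24) one obtains

$$V = \sum_i \frac{q_i}{R}\sum_{l=0}^{\infty} P_l(\cos\theta_i)\left(\frac{r_i}{R}\right)^{l} \tag{3-25}$$

With the use of the definition

$$Q_l = \sum_i q_i\,r_i^{\,l}\,P_l(\cos\theta_i) \tag{3-26}$$

this result becomes the multipole expansion of the potential arising from an
electric charge distribution:

$$V = \sum_{l=0}^{\infty}\frac{Q_l}{R^{l+1}} \tag{3-26}$$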


The monopole strength $Q_0$ is $\sum_i q_i$; the dipole strength $Q_1$ is $\sum_i q_i r_i\cos\theta_i$
and it represents the component of what is called the dipole moment
of the charge distribution, $\sum_i q_i\mathbf{r}_i$, in the direction toward P. The quadrupole
strength, $Q_2 = \sum_i q_i r_i^2 P_2(\cos\theta_i)$, is a scalar quantity constructible from
the components of a tensor called the quadrupole moment,³ and so on.

$Q_l$ is called the strength of the $2^l$-pole of the charge distribution. Its
value depends on the choice of origin. If all charges have the same sign,
$Q_1$ can be made to vanish by a suitable choice of origin. Furthermore,
$Q_2$ can be given an especially simple form by choice of origin and axes, etc.
Similar remarks are true about multipole moments.
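The multipole expansion can be compared with the directly summed potential for a small set of point charges. This is a minimal sketch in units where the potential of a charge q at distance d is q/d; the helper names, the four sample charges and the field point are arbitrary illustrations, and $P_l$ is evaluated with the standard three-term recurrence for Legendre polynomials.

```python
import math


def legendre(l, x):
    """P_l(x) by upward use of (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def potential_direct(charges, point):
    """Sum of q_i / |point - r_i| over all charges."""
    return sum(q / math.dist(pos, point) for q, pos in charges)


def potential_multipole(charges, point, lmax):
    """Sum of Q_l / R^(l+1) for l = 0 .. lmax, with Q_l = sum q_i r_i^l P_l(cos theta_i)."""
    R = math.hypot(*point)
    total = 0.0
    for l in range(lmax + 1):
        Q_l = 0.0
        for q, pos in charges:
            r = math.hypot(*pos)
            # cos(theta_i): angle between the charge position and the field point
            cos_theta = sum(a * b for a, b in zip(pos, point)) / (r * R) if r > 0 else 1.0
            Q_l += q * r ** l * legendre(l, cos_theta)
        total += Q_l / R ** (l + 1)
    return total


# (charge, position) pairs, all closer to the origin than the field point
charges = [(1.0, (0.3, 0.0, 0.2)), (-2.0, (-0.1, 0.4, 0.0)),
           (0.5, (0.0, -0.2, -0.3)), (1.5, (0.2, 0.2, 0.1))]
point = (1.0, 2.0, 2.0)  # R = 3

v_direct = potential_direct(charges, point)
v_series = potential_multipole(charges, point, 12)
print(v_direct, v_series)
```

Because every $r_i/R$ is small here, a dozen terms already reproduce the direct sum to near machine precision; the agreement degrades as any charge approaches the distance R.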

The reader might find it interesting to verify the following statements. 

(1) Two equal charges of opposite sign produce a dipole moment which
is independent of the choice of origin. Their quadrupole moment can be
made to vanish by taking the origin midway between them, in which case
all $Q_l$ with even subscripts vanish.

(2) Four equal charges disposed with alternating signs about the
corners of a parallelogram produce a zero dipole moment and hence a
vanishing $Q_1$; the quadrupole moment for a given orientation of axes is
finite and independent of the choice of origin. $Q_2$ depends on the angle
of orientation.

(3) A continuous spherical distribution of charge has a finite quadrupole-
moment tensor, but vanishing $Q_2$.

The entire analysis leading to eq. (25) presupposes, of course, that every
charge is closer to the origin than the point P, since the requirement
$y = (r_i/R) < 1$ must be obeyed.

From the foregoing results one can derive a useful expression for $P_l$. Let $\mathbf{r}$
be a vector extending from the origin, Δz an increment in the z-direction, and
$\mathbf{R} = \mathbf{r} + \Delta\mathbf{z}$. (Cf. Fig. 5.)

[Fig. 3-5]

If then we express

$$\frac{1}{R} = \left[r^2 + (\Delta z)^2 + 2r\,\Delta z\cos\theta\right]^{-1/2} = \frac{1}{r}\left[1 + 2\,\frac{\Delta z}{r}\cos\theta + \left(\frac{\Delta z}{r}\right)^2\right]^{-1/2}$$

by means of (24), putting −cos θ = x and Δz/r = y, we have

$$\frac{1}{R} = \frac{1}{r}\sum_{l=0}^{\infty} P_l(-\cos\theta)\left(\frac{\Delta z}{r}\right)^{l} \tag{3-27}$$

³ If we label the Cartesian components of $\mathbf{r}_i$ by $r_i^1 = x_i$, $r_i^2 = y_i$, $r_i^3 = z_i$, the tensor
in question is $T_{lm} = \sum_i q_i\,r_i^l\,r_i^m$. In the physical literature the terminology is sometimes
confused, the multipole strength being identified with the multipole moment. It is
correct to say that the multipole strength, i.e., the term appearing in the expansion, is a
scalar form involving the components of the multipole moment, a tensor of rank l.

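On the other hand, if $1/R = 1/|\mathbf{r} + \Delta\mathbf{z}|$ is expanded in a Taylor series
about R = r the result is

$$\frac{1}{R} = \frac{1}{r} + \Delta z\,\frac{\partial}{\partial z}\left(\frac{1}{r}\right) + \cdots + \frac{(\Delta z)^l}{l!}\,\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right) + \cdots \tag{3-28}$$

On comparing the coefficients of $(\Delta z)^l$ in (27) and (28) it is seen that

$$\frac{1}{r^{l+1}}\,P_l(-\cos\theta) = \frac{1}{l!}\,\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right)$$

Since $P_l(-x) = (-1)^l P_l(x)$, this is equivalent to

$$P_l(\cos\theta) = \frac{(-1)^l}{l!}\,r^{l+1}\,\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right) \tag{3-29}$$

In using this relation it is understood that cos θ = z/r, and $r^2 = x^2 + y^2 + z^2$.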

Another expression for $P_l$ involving an l-th derivative, and in some sense
simpler than (29), is known as Rodrigues' formula. To obtain it we observe,
first of all, that

$$(x^2 - 1)^l = \sum_{\lambda=0}^{l}(-1)^{\lambda}\,\frac{l!}{\lambda!\,(l-\lambda)!}\,x^{2l-2\lambda}$$

in accordance with the binomial theorem. When this expression is differentiated
l times, there results:

$$\frac{d^l}{dx^l}(x^2 - 1)^l = \sum_{\lambda}(-1)^{\lambda}\,\frac{l!}{\lambda!\,(l-\lambda)!}\,\frac{(2l-2\lambda)!}{(l-2\lambda)!}\,x^{l-2\lambda}$$

the summation extending over all integers λ including 0 until λ equals
either l/2 or ½(l − 1). The right-hand side of this equation may be written

$$\frac{(2l)!}{l!}\left\{x^l - \frac{l(l-1)}{2(2l-1)}\,x^{l-2} + \cdots\right\}$$

Hence, from the definition of $P_l$, eq. (22),

$$P_l(x) = \frac{1}{2^l\,l!}\,\frac{d^l}{dx^l}(x^2 - 1)^l \tag{3-30}$$

which is the formula of Rodrigues.

An integral representation of $P_l$ is due to Schlaefli. It can be derived
by combining eq. (30) with the fact expressed by eq. (2). If the latter
equation is differentiated l times with respect to the argument of f(x), the
result may be written

$$\frac{d^l f(x)}{dx^l} = \frac{l!}{2\pi i}\oint\frac{f(z)}{(z-x)^{l+1}}\,dz$$

Now choose f(z) to be $(z^2 - 1)^l$, so that

$$\frac{d^l}{dx^l}(x^2 - 1)^l = \frac{l!}{2\pi i}\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz$$

On comparing this with (30), it is seen that

$$P_l(x) = \frac{2^{-l}}{2\pi i}\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz \tag{3-31}$$


The path of integration here is understood to be some contour enclosing
the point x in a counter-clockwise sense. From this result, which is known
as Schlaefli's formula, it is possible to derive the formula:

$$P_l(x) = \frac{1}{\pi}\int_0^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l d\varphi \tag{3-32}$$

To do this, one may take the contour to be a circle of radius $\sqrt{|x^2-1|}$ about x,
so that z in (31) is $x + \sqrt{x^2-1}\,e^{i\varphi}$ and φ varies from −π to +π. The
integral then becomes

$$P_l(x) = \frac{2^{-l}}{2\pi i}\int_{-\pi}^{\pi}\frac{\left[x^2 - 1 + 2x\sqrt{x^2-1}\,e^{i\varphi} + (x^2-1)\,e^{2i\varphi}\right]^l}{(x^2-1)^{(l+1)/2}\,e^{i(l+1)\varphi}}\;i\sqrt{x^2-1}\,e^{i\varphi}\,d\varphi = \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l d\varphi$$

This result is equivalent to (32) because the integral from −π to zero is
equal to that from zero to π.⁴

⁴ In general, the definite integral $\int_{-a}^{0} I(x)\,dx = \int_0^{a} I(x)\,dx$ if the integrand is an
even function of x, that is, if I(−x) = I(x). To see this, change the variable of
integration to −x, and make a corresponding change in the limits:

$$\int_{-a}^{0} I(x)\,dx = -\int_{a}^{0} I(-x)\,dx = \int_0^{a} I(-x)\,dx = \int_0^{a} I(x)\,dx$$

If I(x) is odd, that is if I(−x) = −I(x),

$$\int_{-a}^{0} I(x)\,dx = -\int_0^{a} I(x)\,dx \quad\text{so that}\quad \int_{-a}^{a} I(x)\,dx = 0$$




3.4. Integral Properties of Legendre Polynomials.—Integrals over 
products of Legendre polynomials, which are needed in many quantum 
mechanical problems, are best obtained with the use of Rodrigues’ formula. 
We wish to calculate 


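$$\int_{-1}^{1} P_l(x)\,P_{l'}(x)\,dx$$

First, we suppose that l′ > l. Substituting in accordance with eq. (30)
this integral becomes

$$\frac{1}{2^{l+l'}\,l!\,l'!}\int_{-1}^{1}\frac{d^l}{dx^l}(x^2-1)^l\cdot\frac{d^{l'}}{dx^{l'}}(x^2-1)^{l'}\,dx \tag{3-33}$$

After l′ successive partial integrations, in which all the integrated parts
vanish because every derivative of $(x^2-1)^{l'}$ of order lower than l′ is zero
at x = ±1, the remaining integral reads

$$\frac{(-1)^{l'}}{2^{l+l'}\,l!\,l'!}\int_{-1}^{1}(x^2-1)^{l'}\,\frac{d^{l+l'}}{dx^{l+l'}}(x^2-1)^l\,dx \tag{3-34}$$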


But the (l + l′)-th derivative of $(x^2-1)^l$ is certainly zero because the
highest power of x in $(x^2-1)^l$ is $x^{2l}$, and l + l′ is, by hypothesis, greater
than 2l. Therefore the integral vanishes. This is clearly true whenever
l′ ≠ l; for if l should be greater than l′ we need only "unpeel" the derivatives
appearing in (33) in the reverse manner by partial integrations, and
we are left with an expression like (34) but with l and l′ interchanged.
Next, suppose that l = l′. The integral in (34) then reads

$$\int_{-1}^{1}(x^2-1)^l\,\frac{d^{2l}}{dx^{2l}}(x^2-1)^l\,dx = (2l)!\int_{-1}^{1}(x^2-1)^l\,dx$$

the latter because the only term of $(x^2-1)^l$ which will not vanish after 2l
differentiations is $x^{2l}$. But on putting x = cos θ it is seen that

$$\int_{-1}^{1}(x^2-1)^l\,dx = (-1)^l\,2\int_0^{\pi/2}\sin^{2l+1}\theta\,d\theta = (-1)^l\cdot 2\cdot\frac{2\cdot 4\cdots 2l}{3\cdot 5\cdots(2l+1)} = \frac{(-1)^l\,2^{2l+1}\,(l!)^2}{(2l+1)!}$$

Collecting coefficients, we find in place of (34)

$$\frac{(-1)^l}{2^{2l}\,(l!)^2}\,(2l)!\;\frac{(-1)^l\,2^{2l+1}\,(l!)^2}{(2l+1)!} = \frac{2}{2l+1}$$

Our results may be combined in the formula

$$\int_{-1}^{1} P_l(x)\,P_{l'}(x)\,dx = \frac{2}{2l+1}\,\delta_{ll'} \tag{3-35}$$

The symbol $\delta_{ll'}$ here employed is freely used in mathematics and physics;
it is called the "Kronecker" δ and represents a discontinuous factor which
is taken to be unity when the two subscripts have the same value (l = l′)
but is zero when they are not equal.
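The orthogonality relation (3-35) can be confirmed exactly for low degrees with rational arithmetic. In this sketch the polynomials are built from the standard three-term recurrence for Legendre polynomials rather than from Rodrigues' formula, and the function names are illustrative choices.

```python
from fractions import Fraction


def legendre_coeffs(l):
    """Coefficients (lowest power first) of P_l, built by the three-term recurrence."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return p_prev
    for k in range(1, l):
        x_p = [Fraction(0)] + p                                   # x * P_k
        padded = p_prev + [Fraction(0)] * (len(x_p) - len(p_prev))
        nxt = [((2 * k + 1) * a - k * b) / (k + 1) for a, b in zip(x_p, padded)]
        p_prev, p = p, nxt
    return p


def integrate_product(a, b):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if (i + j) % 2 == 0:                                  # odd powers integrate to zero
                total += ai * bj * Fraction(2, i + j + 1)
    return total


for l in range(4):
    print([integrate_product(legendre_coeffs(l), legendre_coeffs(lp)) for lp in range(4)])
```

Each printed row should contain 2/(2l + 1) on the diagonal and zeros elsewhere.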

3.5. Recurrence Relations between Legendre Polynomials.—Rela- 
tions between Legendre polynomials are most simply derived from Schlae- 
fli’s representation of P(x), eq. (31). We first observe that 

$$\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz = \oint\frac{z\,(z^2-1)^l}{(z-x)^{l+1}}\,dz - x\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz \tag{3-36}$$

The first term on the right of this equation, however, may be transformed as
follows. Since

$$\frac{d}{dz}\left[\frac{(z^2-1)^{l+1}}{(z-x)^{l+1}}\right] = (l+1)\left[\frac{2z\,(z^2-1)^l}{(z-x)^{l+1}} - \frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\right]$$

and since the integral of the left-hand side around a closed contour must
vanish, we find

$$\oint\frac{z\,(z^2-1)^l}{(z-x)^{l+1}}\,dz = \frac{1}{2}\oint\frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\,dz$$

Equation (36) thus reads

$$\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz = \frac{1}{2}\oint\frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\,dz - x\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz$$

Reference to equation (31) allows the two terms on the right to be identified,
after multiplication by $2^{-l}/2\pi i$, with $P_{l+1}(x) - xP_l(x)$. Hence

$$P_{l+1}(x) - xP_l(x) = \frac{2^{-l}}{2\pi i}\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz \tag{3-37}$$

When this is differentiated with respect to x, there results

$$P'_{l+1}(x) - xP'_l(x) = (l+1)\,P_l(x) \tag{3-38}$$

which is the first important relation to be derived. It connects Legendre
functions, and their derivatives, of degrees l and l + 1.

A relation connecting Legendre polynomials of three different degrees
may be deduced in a similar way. Clearly, since

$$\oint\frac{d}{dz}\left[\frac{z\,(z^2-1)^l}{(z-x)^l}\right]dz = 0$$

we have

$$\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz + 2l\oint\frac{z^2\,(z^2-1)^{l-1}}{(z-x)^l}\,dz - l\oint\frac{z\,(z^2-1)^l}{(z-x)^{l+1}}\,dz = 0$$

In the second integrand, we introduce $z^2 = (z^2-1) + 1$, and in the third
z = (z − x) + x, so that

$$(l+1)\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz + 2l\oint\frac{(z^2-1)^{l-1}}{(z-x)^l}\,dz - lx\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz = 0$$

Now the first term appearing here may be identified by means of (37), the
others by (31). Thus, after simple rearrangement,

$$(l+1)\,P_{l+1}(x) - (2l+1)\,x\,P_l(x) + l\,P_{l-1}(x) = 0 \tag{3-39}$$
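Relation (3-39) gives a convenient way of generating numerical values of $P_l(x)$ upward from $P_0 = 1$ and $P_1 = x$. The sketch below (function name and sample values of x and y are arbitrary) uses it to sum the generating series (3-24) and compares the result with the closed form (3-23).

```python
def legendre_values(lmax, x):
    """Return [P_0(x), ..., P_lmax(x)] using the recurrence (3-39)."""
    vals = [1.0, x]
    for l in range(1, lmax):
        # (l+1) P_{l+1} = (2l+1) x P_l - l P_{l-1}
        vals.append(((2 * l + 1) * x * vals[l] - l * vals[l - 1]) / (l + 1))
    return vals[:lmax + 1]


x, y = 0.4, 0.3
series = sum(p * y ** l for l, p in enumerate(legendre_values(60, x)))
closed = (1.0 - 2.0 * x * y + y * y) ** -0.5

print(series, closed)
```

Since $|P_l(x)| \le 1$ on the interval, the terms fall off like $y^l$ and sixty of them are far more than needed for y = 0.3.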


The remaining relations are derived by differentiation and elimination
among formulas (38) and (39). Thus, when (39) is differentiated with
respect to x and $P'_{l+1}$ is eliminated by means of (38),

$$xP'_l(x) - P'_{l-1}(x) = l\,P_l(x) \tag{3-40}$$

Finally, the reader will have no difficulty in proving, by eliminations among
(38), (39), and (40), that

$$P'_{l+1}(x) - P'_{l-1}(x) = (2l+1)\,P_l(x) \tag{3-41}$$

and

$$(x^2-1)\,P'_l(x) = l\,x\,P_l(x) - l\,P_{l-1}(x) \tag{3-42}$$

It may be remarked that the recurrence relations here derived are also
correct for Legendre functions having non-integral indices l, although the
above proof does not indicate this fact explicitly.

3.6. Associated Legendre Polynomials.—The associated Legendre 
polynomial has been defined in sec. 2.11 as 


$$P_l^m(x) = (1 - x^2)^{m/2}\,\frac{d^m}{dx^m}P_l(x) \tag{3-43}$$


This definition is meaningless except when m is an integer not smaller than 
zero. In the present discussion, this will always be understood to be the 
case. 

Recurrence Relations. We first derive the more important recurrence
relations between these functions, which, as was shown in 2.11, satisfy the
differential equation

$$(1-x^2)\,\frac{d^2P_l^m}{dx^2} - 2x\,\frac{dP_l^m}{dx} + \left[l(l+1) - \frac{m^2}{1-x^2}\right]P_l^m = 0$$

The function $P_l^{(m)} = (d^m/dx^m)P_l$ was seen to be a solution of

$$(1-x^2)\,\frac{d^2}{dx^2}P_l^{(m)} - 2(m+1)\,x\,\frac{d}{dx}P_l^{(m)} + \left[l(l+1) - m(m+1)\right]P_l^{(m)} = 0 \tag{3-44}$$

When this equation is multiplied by $(1-x^2)^{m/2}$ and the definition (43) is
used, it may be written

$$P_l^{m+2} - 2(m+1)\,\frac{x}{\sqrt{1-x^2}}\,P_l^{m+1} + \left[l(l+1) - m(m+1)\right]P_l^{m} = 0$$

or, on replacing m by m − 1

$$P_l^{m+1} - 2m\,\frac{x}{\sqrt{1-x^2}}\,P_l^{m} + \left[l(l+1) - m(m-1)\right]P_l^{m-1} = 0 \tag{3-45}$$

This represents the fundamental relation between three associated Legendre
functions with equal l but consecutive values of m.
To get a similar relation for equal m but consecutive l we return to
eqs. (39) and (41). Differentiating the first of these m times we have

$$(l+1)\,P_{l+1}^{(m)} - (2l+1)\,x\,P_l^{(m)} - (2l+1)\,m\,P_l^{(m-1)} + l\,P_{l-1}^{(m)} = 0 \tag{3-46}$$

When (41) is differentiated m − 1 times, the result is

$$P_{l+1}^{(m)} - P_{l-1}^{(m)} = (2l+1)\,P_l^{(m-1)} \tag{3-47}$$

On eliminating $P_l^{(m-1)}$ between (46) and (47) we find

$$(2l+1)\,x\,P_l^{(m)} = (l+m)\,P_{l-1}^{(m)} + (l-m+1)\,P_{l+1}^{(m)}$$

When this is multiplied by $(1-x^2)^{m/2}$, there results the desired relation:

$$x\,P_l^m = (2l+1)^{-1}\left[(l+m)\,P_{l-1}^m + (l-m+1)\,P_{l+1}^m\right] \tag{3-48}$$

Two "mixed" relations, in which both l and m have different values,
are often useful (cf. Chapter 11) and will now be derived. One is at once
obtained from (47) when that equation is written with m replaced by m + 1
and then multiplied by $(1-x^2)^{(m+1)/2}$:

$$(1-x^2)^{1/2}\,P_l^m = (2l+1)^{-1}\left[P_{l+1}^{m+1} - P_{l-1}^{m+1}\right] \tag{3-49}$$
The other can be deduced from eq. (45). When $xP_l^m$ in (45) is eliminated
by means of (48) it reads:

$$P_l^{m+1} = \frac{2m}{(1-x^2)^{1/2}\,(2l+1)}\left[(l+m)\,P_{l-1}^m + (l-m+1)\,P_{l+1}^m\right] - \left[l(l+1) - m(m-1)\right]P_l^{m-1}$$

Here $P_l^{m-1}$ can be expressed in terms of $P_{l+1}^m$ and $P_{l-1}^m$ by means of (49)
(written for m − 1 instead of m). When this is done and the terms are
collected, we find

$$(1-x^2)^{1/2}\,P_l^{m+1} = (2l+1)^{-1}\left[(l+m)(l+m+1)\,P_{l-1}^m - (l-m)(l-m+1)\,P_{l+1}^m\right] \tag{3-50a}$$

For convenience in later work this may be written in a form similar to (49)
if only m is replaced by m − 1. Thus

$$(1-x^2)^{1/2}\,P_l^{m} = (2l+1)^{-1}\left[(l+m)(l+m-1)\,P_{l-1}^{m-1} - (l-m+1)(l-m+2)\,P_{l+1}^{m-1}\right] \tag{3-50}$$

It is seen from (49) and (50) that $\sqrt{1-x^2}\,P_l^m$ can be expressed in terms
of $P_{l+1}^{m+1}$ and $P_{l-1}^{m+1}$, as well as $P_{l+1}^{m-1}$ and $P_{l-1}^{m-1}$. Relations (48), (49), and
(50) are used in calculating quantum mechanical matrix elements in
central field problems (cf. sec. 11.13).

Integral Properties. It is desired to evaluate the integral over the
product of two associated Legendre functions having the same index m.
If we use the definition (43) together with Rodrigues' formula (30) we have

$$\int_{-1}^{1}P_l^m(x)\,P_{l'}^m(x)\,dx = \frac{(-1)^m}{2^{l+l'}\,l!\,l'!}\int_{-1}^{1}X^m\,\frac{d^{l+m}X^l}{dx^{l+m}}\,\frac{d^{l'+m}X^{l'}}{dx^{l'+m}}\,dx \tag{3-51}$$

where $X = x^2 - 1$. As was done in connection with the Legendre polynomials,
we again carry out a sequence of partial integrations, l′ + m in
number, in which all the integrated parts are zero. The integral in (51)
then reads

$$(-1)^{l'+m}\int_{-1}^{1}X^{l'}\,\frac{d^{l'+m}}{dx^{l'+m}}\left[X^m\,\frac{d^{l+m}X^l}{dx^{l+m}}\right]dx = (-1)^{l'+m}\int_{-1}^{1}X^{l'}\sum_{\lambda}\binom{l'+m}{\lambda}\frac{d^{l'+m-\lambda}}{dx^{l'+m-\lambda}}(X^m)\,\frac{d^{l+m+\lambda}}{dx^{l+m+\lambda}}(X^l)\,dx$$

Now the term of highest power in $X^m$ is $x^{2m}$, that in $X^l$ is $x^{2l}$. Therefore
every term in the summation over λ will be zero unless, simultaneously,

$$l' + m - \lambda \le 2m \quad\text{and}\quad l + m + \lambda \le 2l \tag{3-52}$$

The first of these implies: λ ≥ l′ − m, the second λ ≤ l − m. Let us
suppose that l < l′. Since m is positive, these two relations are incompatible,
and the summation contains no term which is different from zero.
Hence the integral (51) vanishes. If l > l′, it must also be zero because
the integrand is perfectly symmetrical with respect to l and l′. To show
this result explicitly, the partial integrations must be performed in the
reverse manner.

If l′ = l the two relations (52) are indeed compatible, but only for the
single value λ = l − m. Hence the sum over λ contains only one term,
and the integral becomes

$$(-1)^{l+m}\int_{-1}^{1}X^l\binom{l+m}{l-m}\frac{d^{2m}}{dx^{2m}}(X^m)\,\frac{d^{2l}}{dx^{2l}}(X^l)\,dx = (-1)^{l+m}\binom{l+m}{l-m}(2l)!\,(2m)!\int_{-1}^{1}X^l\,dx$$
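The remaining integral has already been computed (preceding eq. 35);
it was found to be

$$\frac{(-1)^l\,2^{2l+1}\,(l!)^2}{(2l+1)!}$$

On the other hand, $\binom{l+m}{l-m} = \dfrac{(l+m)!}{(l-m)!\,(2m)!}$. Thus we find, collecting
the various factors,

$$\int_{-1}^{1}\left[P_l^m(x)\right]^2dx = \frac{(-1)^m}{2^{2l}\,(l!)^2}\,(-1)^{l+m}\,\frac{(l+m)!\,(2l)!\,(2m)!}{(l-m)!\,(2m)!}\cdot\frac{(-1)^l\,2^{2l+1}\,(l!)^2}{(2l+1)!} = \frac{(l+m)!}{(l-m)!}\,\frac{2}{2l+1}$$

It has thus been proved that

$$\int_{-1}^{1}P_l^m(x)\,P_{l'}^m(x)\,dx = \frac{(l+m)!}{(l-m)!}\,\frac{2}{2l+1}\,\delta_{ll'} \tag{3-53}$$

If x is taken to be cos θ, this result may be written in the equivalent form:

$$\int_0^{\pi}P_l^m(\cos\theta)\,P_{l'}^m(\cos\theta)\,\sin\theta\,d\theta = \frac{(l+m)!}{(l-m)!}\,\frac{2}{2l+1}\,\delta_{ll'} \tag{3-53a}$$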


3.7. Addition Theorem for Legendre Polynomials——To prove the 
famous addition theorem for Legendre polynomials (eq. 61) it is necessary 
first to establish a formula due to Heine. If we substitute the Schlaefli 
integral, eq. (31), for $P_l$ in the definition of $P_l^m$ (eq. 43) and carry out the
differentiations with respect to x under the integral sign, we have

$$P_l^m(x) = \frac{2^{-l}}{2\pi i}\,(l+1)(l+2)\cdots(l+m)\,(1-x^2)^{m/2}\oint(z^2-1)^l\,(z-x)^{-(l+m+1)}\,dz$$

Now let $z = x + \sqrt{x^2-1}\,e^{i\varphi}$ and integrate over φ from −π to π in accordance
with the meaning of the contour (cf. eq. 31 et seq.). Then

$$P_l^m(x) = \frac{(l+1)(l+2)\cdots(l+m)}{2\pi}\,(1-x^2)^{m/2}\int_{-\pi}^{\pi}\frac{\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l}{\left[\sqrt{x^2-1}\,e^{i\varphi}\right]^m}\,d\varphi$$

$$= (-1)^{m/2}\,\frac{(l+1)(l+2)\cdots(l+m)}{2\pi}\int_{-\pi}^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l e^{-im\varphi}\,d\varphi$$

$$P_l^m(x) = (-1)^{m/2}\,\frac{(l+1)(l+2)\cdots(l+m)}{\pi}\int_0^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l\cos m\varphi\,d\varphi \tag{3-54a}$$

In taking the last step we observe that, of the two constituents of
$e^{-im\varphi} = \cos m\varphi - i\sin m\varphi$, the first is an even, the second an odd function
of φ. Since the other remaining factor of the integrand is even, only the
cosine part of $e^{-im\varphi}$ will give a finite integral, and this has twice the value of
the integral between the limits 0 and π. Eq. (54a) is Heine's formula.

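If in the differential equation for $P_l^m(x)$ we substitute −l − 1 for l, the
equation remains unaltered. Therefore $P_{-l-1}^m = P_l^m$. In view of this,
Heine's formula may also be written

$$P_l^m(x) = P_{-l-1}^m(x) = (-1)^{m/2}\,\frac{(-l)(-l+1)\cdots(-l+m-1)}{\pi}\int_0^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^{-l-1}\cos m\varphi\,d\varphi$$

$$= (-1)^{3m/2}\,\frac{l(l-1)\cdots(l-m+1)}{\pi}\int_0^{\pi}\frac{\cos m\varphi\,d\varphi}{\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^{l+1}} \tag{3-54b}$$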


To prove the addition theorem we consider the equation

$$\sum_{l=0}^{\infty}p^l\,\frac{\left[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)\right]^l}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}} = \left\{x_2 + (x_2^2-1)^{1/2}\cos\alpha - p\left[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)\right]\right\}^{-1} \tag{3-55}$$

which is an identity for sufficiently small values of the parameter p. All
other quantities appearing in (55) are supposed to be real, but otherwise
unrestricted. This relation is simply an application of the expansion

$$\sum_{l=0}^{\infty}z^l = (1-z)^{-1},\qquad |z| < 1$$

Let us integrate eq. (55) over dα from −π to π. The integral on the right
may be evaluated by means of the formula

$$\int_{-\pi}^{\pi}\frac{d\alpha}{a + b\cos\alpha + c\sin\alpha} = 2\pi\,(a^2 - b^2 - c^2)^{-1/2}$$
which was proved as an example on page 93. Here

$$a = x_2 - p\,x_1$$

$$b = (x_2^2-1)^{1/2} - p\,(x_1^2-1)^{1/2}\cos\omega$$

$$c = -p\,(x_1^2-1)^{1/2}\sin\omega$$

Hence the right-hand side of (55) becomes after integration

$$2\pi\left\{1 - 2p\left[x_1x_2 - (x_1^2-1)^{1/2}(x_2^2-1)^{1/2}\cos\omega\right] + p^2\right\}^{-1/2}$$

As will be seen forthwith, the expression in [ ] appearing here has a very
simple geometrical meaning. For the present, let us designate it by x:

$$x = x_1x_2 - (x_1^2-1)^{1/2}(x_2^2-1)^{1/2}\cos\omega \tag{3-56}$$

The result of the integration may therefore be written

$$2\pi\,(1 - 2px + p^2)^{-1/2}$$

But by the theorem on Legendre functions, eq. (24), this is

$$2\pi\sum_{l=0}^{\infty}p^l\,P_l(x)$$


The left-hand side of (55) may be integrated term by term because the
expression is assumed to converge. On comparing coefficients of $p^l$ we
see, therefore, that

$$P_l(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\left[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)\right]^l}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}}\,d\alpha \tag{3-57}$$


The last step of the proof involves an expansion of $P_l(x)$ in a Fourier
series.⁵ Clearly, $P_l(x)$, being a polynomial of degree l in cos ω, can be
expressed in the form

$$P_l(x) = \tfrac12 c_0 + \sum_{m=1}^{l}c_m\cos m\omega \tag{3-58}$$

The coefficients $c_m$ are given by

$$c_m = \frac{1}{\pi}\int_{-\pi}^{\pi}P_l(x)\cos m\omega\,d\omega = \frac{1}{2\pi^2}\int_{-\pi}^{\pi}d\alpha\int_{-\pi}^{\pi}d\omega\,\frac{\left[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)\right]^l\cos m\omega}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}}$$

when $P_l(x)$ is replaced in accordance with (57). In this integral, we may
introduce the variable ω − α = β in place of ω, so that

$$c_m = \frac{1}{2\pi^2}\int_{-\pi}^{\pi}\frac{d\alpha}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}}\int_{-\pi}^{\pi}\left[x_1 + (x_1^2-1)^{1/2}\cos\beta\right]^l(\cos m\alpha\cos m\beta - \sin m\alpha\sin m\beta)\,d\beta$$

The integration with respect to β over the term containing sin mβ is obviously
zero because the integrand is odd. The other term can be evaluated
by means of eq. (54a). The result is

$$\frac{1}{2\pi^2}\int_{-\pi}^{\pi}\frac{\cos m\alpha\,d\alpha}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}}\cdot\frac{2\pi\,(-1)^{-m/2}}{(l+1)(l+2)\cdots(l+m)}\,P_l^m(x_1)$$

In the remaining integration over α we use (54b), obtaining

$$c_m = 2\,\frac{(l-m)!}{(l+m)!}\,P_l^m(x_1)\,P_l^m(x_2)$$

Hence from (58):

$$P_l(x) = P_l(x_1)\,P_l(x_2) + 2\sum_{m=1}^{l}\frac{(l-m)!}{(l+m)!}\,P_l^m(x_1)\,P_l^m(x_2)\cos m\omega \tag{3-59}$$

which is the desired addition theorem.

⁵ See sec. 8.2.

Finally, let us investigate the meaning of x defined in eq. (56). If
$\theta_1, \varphi_1$ and $\theta_2, \varphi_2$ denote, respectively, the polar and azimuthal angles of
two lines passing through the origin, then Θ, the angle between these two
lines, is given by

$$\cos\Theta = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2\cos(\varphi_1 - \varphi_2) \tag{3-60}$$

Thus, if in (56) the following identifications are made:

$$x = \cos\Theta,\quad x_1 = \cos\theta_1,\quad x_2 = \cos\theta_2,\quad \omega = \varphi_1 - \varphi_2$$

then (59) becomes

$$P_l(\cos\Theta) = P_l(\cos\theta_1)\,P_l(\cos\theta_2) + 2\sum_{m=1}^{l}\frac{(l-m)!}{(l+m)!}\,P_l^m(\cos\theta_1)\,P_l^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2) \tag{3-61}$$
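The addition theorem (3-61) lends itself to a direct numerical check. The sketch below evaluates $P_l^m$ from the definition (3-43), that is, without any additional sign factor, by differentiating the polynomial coefficients of $P_l$; the function names and the sample angles are arbitrary choices made for illustration.

```python
import math


def legendre_coeffs(l):
    """Coefficients (lowest power first) of P_l from the three-term recurrence."""
    p_prev, p = [1.0], [0.0, 1.0]
    if l == 0:
        return p_prev
    for k in range(1, l):
        x_p = [0.0] + p
        padded = p_prev + [0.0] * (len(x_p) - len(p_prev))
        p_prev, p = p, [((2 * k + 1) * a - k * b) / (k + 1) for a, b in zip(x_p, padded)]
    return p


def assoc_legendre(l, m, x):
    """P_l^m(x) = (1 - x^2)^(m/2) d^m P_l / dx^m, for 0 <= m <= l."""
    c = legendre_coeffs(l)
    for _ in range(m):
        c = [i * ci for i, ci in enumerate(c)][1:]     # differentiate once
    return (1.0 - x * x) ** (m / 2) * sum(ci * x ** i for i, ci in enumerate(c))


def addition_rhs(l, th1, ph1, th2, ph2):
    """Right-hand side of the addition theorem (3-61)."""
    x1, x2 = math.cos(th1), math.cos(th2)
    total = assoc_legendre(l, 0, x1) * assoc_legendre(l, 0, x2)
    for m in range(1, l + 1):
        weight = 2.0 * math.factorial(l - m) / math.factorial(l + m)
        total += weight * assoc_legendre(l, m, x1) * assoc_legendre(l, m, x2) \
            * math.cos(m * (ph1 - ph2))
    return total


th1, ph1, th2, ph2 = 0.7, 0.4, 1.9, -1.1
cos_big_theta = (math.cos(th1) * math.cos(th2)
                 + math.sin(th1) * math.sin(th2) * math.cos(ph1 - ph2))   # eq. (3-60)

for l in range(1, 5):
    print(l, assoc_legendre(l, 0, cos_big_theta), addition_rhs(l, th1, ph1, th2, ph2))
```

For l = 1 the right-hand side reduces to (3-60) itself, which is a quick sanity check on the normalization of $P_1^1 = \sin\theta$.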


In quantum mechanics (cf. sec. 11.12) it is convenient to use associated
Legendre functions which differ from $P_l^m(x)$ by factors depending on l and
m, but constant with respect to x and so chosen that the integral over the
square of the functions is unity. These functions are called "normalized"
associated Legendre functions. They will here be denoted by $\Pi_l^m$. Let
us put $\Pi_l^m(x) = N_{lm}P_l^m(x)$. Then if we wish $\int_0^{\pi}\left[\Pi_l^m(\cos\theta)\right]^2\sin\theta\,d\theta$ to
be equal to unity we must, in view of (53), put

$$N_{lm} = \left[\frac{2l+1}{2}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2}$$

It is also customary to permit the index m of $\Pi_l^m$ to be negative and to
define

$$\Pi_l^m(x) = \left[\frac{2l+1}{2}\,\frac{(l-|m|)!}{(l+|m|)!}\right]^{1/2}P_l^{|m|}(x) \tag{3-62}$$

The index m may then take on all integral values including zero from −l
to +l, while l is always a positive integer. In terms of the functions $\Pi_l^m$,
the addition theorem takes a particularly simple form:

$$\frac{2l+1}{2}\,P_l(\cos\Theta) = \sum_{m=-l}^{l}\Pi_l^m(\cos\theta_1)\,\Pi_l^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2) \tag{3-61a}$$

One may also replace the factor $\cos m(\varphi_1 - \varphi_2)$ in each term of this summation
by $e^{im(\varphi_1-\varphi_2)}$, because each pair of terms corresponding to +m and −m
then yields a cosine function.

Problem: Express eqs. (48) to (50) in terms of Π-functions.
3.8. Bessel Functions.—In sec. 2.14 we have shown that a particular 


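solution of Bessel's differential equation (eq. 2-57) is the "Bessel function
of order n," defined as (cf. eq. 2-60)

$$J_n(x) = \sum_{\lambda=0}^{\infty}\frac{(-1)^{\lambda}}{\Gamma(\lambda+1)\,\Gamma(\lambda+n+1)}\left(\frac{x}{2}\right)^{n+2\lambda} \tag{3-63}$$

It is of interest, first, to note that for integral n, $J_n(x)$ is the coefficient of $u^n$
in the expansion of exp [(x/2)(u − 1/u)]. In fact $J_n(x)$ for integral n,
called Bessel's coefficient, was originally defined by means of this relation.
To prove it we merely expand the exponential, using the binomial theorem⁶
to express $(u - 1/u)^{\nu}$:

$$\exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right] = \sum_{\nu=0}^{\infty}\frac{1}{\nu!}\left(\frac{x}{2}\right)^{\nu}\left(u - \frac{1}{u}\right)^{\nu} = \sum_{\nu=0}^{\infty}\frac{1}{\nu!}\left(\frac{x}{2}\right)^{\nu}\sum_{\lambda=0}^{\nu}(-1)^{\lambda}\,\frac{\nu!}{\lambda!\,(\nu-\lambda)!}\,u^{\nu-2\lambda}$$

If we now put ν − 2λ = n, this becomes

$$\exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right] = \sum_{n=-\infty}^{\infty}u^n\left[\sum_{\lambda}\frac{(-1)^{\lambda}}{\lambda!\,(n+\lambda)!}\left(\frac{x}{2}\right)^{n+2\lambda}\right] \tag{3-64}$$

For integral n, the bracket appearing here is identical with the expansion
(63); hence the above-mentioned theorem is proved.

⁶ The binomial theorem states: $(a+b)^{\nu} = \sum_{\lambda=0}^{\nu}\dfrac{\nu!}{\lambda!\,(\nu-\lambda)!}\,a^{\nu-\lambda}\,b^{\lambda}$.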

From (64) an integral representation may be derived quite simply.
By the theorem of residues (eq. 3) the coefficient of $z^{-1}$ in an expansion of
f(z) is given by

$$a_{-1} = \frac{1}{2\pi i}\oint f(z)\,dz$$

the integral being taken in a counter-clockwise sense about z = 0. Similarly,
the coefficient of $z^n$ in the expansion of f(z) will be:

$$a_n = \frac{1}{2\pi i}\oint\frac{f(z)}{z^{n+1}}\,dz$$

The theorem just proved is therefore tantamount to the relation

$$J_n(x) = \frac{1}{2\pi i}\oint u^{-n-1}\,e^{(x/2)(u-1/u)}\,du \tag{3-65}$$

It is customary to write this result in a slightly different form, obtainable
on replacement of the variable u by 2t/x. Eq. (65) then reads

$$J_n(x) = \frac{1}{2\pi i}\left(\frac{x}{2}\right)^{n}\oint t^{-n-1}\exp\left[t - \frac{x^2}{4t}\right]dt \tag{3-66}$$


While this integral has been shown here to be identical with the convergent
sum of eq. (63) only if n is an integer, a more special consideration
would indeed establish the equivalence of (63) and (66) for non-integral
n also. A simple way to prove this fact is to show, by substitution, that
(66) satisfies Bessel's differential equation. On performing the differentiations⁷
indicated on the left of eq. (2-57) and substituting therein, we find

$$x^2J_n'' + xJ_n' + (x^2 - n^2)J_n = \frac{x^{n+2}}{2^{n+1}\pi i}\oint t^{-n-1}\left(1 - \frac{n+1}{t} + \frac{x^2}{4t^2}\right)\exp\left(t - \frac{x^2}{4t}\right)dt$$

$$= \frac{x^{n+2}}{2^{n+1}\pi i}\oint\frac{d}{dt}\left\{t^{-n-1}\exp\left(t - \frac{x^2}{4t}\right)\right\}dt = 0$$

because the integral around a closed loop of an exact differential is zero.
We may therefore regard either (63) or (66) as a definition of $J_n(x)$ for both
integral and non-integral values of n. For non-integral n, however, caution
is required in the choice of the contour of integration in (66). This must
clearly enclose the origin. But if we were to take, for example, a circle
about the origin as center we should encounter a difficulty. For non-integral
n the integrand is a many-valued function of t. Thus if the amplitude
of t should vary, say, from −π to +π, the integrand will not have performed
a closed loop⁸ and the last equation above would not be true. It is necessary,
therefore, to select a path of integration which (a) encloses the origin
and no other singularity of the integrand; (b) starts and ends at a point in
the t-plane which will cause the integrand to perform a closed loop also.

[Fig. 3-6]

Such a path is that illustrated in Fig. 6. Whenever eq. (65) or (66) is used,
we shall understand that the contour integral is taken along this path.
(The reader familiar with the theory of many-valued functions will observe
that this path confines the integrand entirely to one of its branches provided
that the argument of t is given its principal value.)

⁷ The differentiations may be carried out in (66) without regard to the fixed path of
integration, that is, "under the integral sign."

Recurrence Relations. From (66) one may show by direct differentiation
that

$$\frac{d}{dx}\left[x^{-n}J_n(x)\right] = -x^{-n}J_{n+1}(x) \tag{3-67}$$

or, when the differentiation on the left is carried out,

$$J_n'(x) = \frac{n}{x}\,J_n(x) - J_{n+1}(x) \tag{3-68}$$

To obtain the other fundamental recurrence formula we perform the
differentiations in the equation

$$\oint\frac{d}{dt}\left[t^{-n}\exp\left(t - \frac{x^2}{4t}\right)\right]dt = 0$$

The result is

$$\oint\left(-n\,t^{-n-1} + t^{-n} + \frac{x^2}{4}\,t^{-n-2}\right)\exp\left(t - \frac{x^2}{4t}\right)dt = 0$$

When use is now made of the definition (66) this reads

$$2\pi i\left(\frac{2}{x}\right)^{n-1}\left\{J_{n+1} + J_{n-1} - \frac{2n}{x}\,J_n\right\} = 0$$

hence

$$J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x}\,J_n(x) \tag{3-69}$$

On eliminating $J_{n+1}$ from (68) by means of (69) there results

$$J_n'(x) = J_{n-1}(x) - \frac{n}{x}\,J_n(x) \tag{3-70}$$

and from this and (68)

$$J_n'(x) = \tfrac12\left[J_{n-1}(x) - J_{n+1}(x)\right] \tag{3-71}$$

⁸ As an example of many-valued functions, consider $\sqrt{t} = \sqrt{\rho\,e^{i\varphi}} = \sqrt{\rho}\,e^{i\varphi/2}$. When
t moves along a circle of unit radius from −π to +π, initial and final points are identical.
But $\sqrt{t}$ has the initial point $e^{-i\pi/2} = -i$ and the final point $e^{i\pi/2} = +i$.
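The recurrence formulas hold for non-integral order as well, which can be illustrated by summing the series definition of $J_n(x)$ with the Γ-function in the denominators. In this sketch the function name `bessel_j`, the truncation at 40 terms, the order n = 1.5 and the argument x = 2.7 are all arbitrary choices, and the derivative in (3-71) is approximated by a central difference.

```python
import math


def bessel_j(n, x, terms=40):
    """J_n(x) from the power series, valid for moderate x and n not a negative integer."""
    total = 0.0
    for lam in range(terms):
        total += ((-1) ** lam / (math.gamma(lam + 1) * math.gamma(lam + n + 1))
                  * (x / 2.0) ** (n + 2 * lam))
    return total


n, x = 1.5, 2.7

# (3-69): J_{n-1} + J_{n+1} = (2n/x) J_n
lhs_69 = bessel_j(n - 1, x) + bessel_j(n + 1, x)
rhs_69 = 2.0 * n / x * bessel_j(n, x)

# (3-71): J_n' = (J_{n-1} - J_{n+1}) / 2, derivative by central difference
h = 1e-5
deriv = (bessel_j(n, x + h) - bessel_j(n, x - h)) / (2.0 * h)
rhs_71 = 0.5 * (bessel_j(n - 1, x) - bessel_j(n + 1, x))

print(lhs_69, rhs_69)
print(deriv, rhs_71)
```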


Bessel’s Integral. Let us consider J,(x) as defined by eq. (65): 
Jn(x) = L $ uer ud w) dy 
2r: 


We may free ourselves from the condition that n be an integer by choosing 
a path like that of Fig.6. More specifically, the contour will be taken to 
start at — œ, pass to the right below the real axis (u = te", © >t 2 1) 
up to the point —1, then perform a circle of unit radius in a counter-clock- 
wise sense about the origin (u = e, —r < 8 < r) and finally to return te 
— œ above the real axis (u = te, +1 Si < +). The contour integral 
then becomes 


1 . ° 1 
Jn(z) = ae f Erl exp 5 (- + *) dt + 
1 A7 _-niotizsind i _ j te x I 
2s — dð — — (n+l)ir n=l -f — — 
on J, € Oni é a E exp 3 t+ 7 di 


The second of these integrals may be written

$$\frac{1}{\pi}\int_0^{\pi}\cos(n\theta - x\sin\theta)\,d\theta$$

because the odd part of the exponential, $\sin(n\theta - x\sin\theta)$, vanishes on
integration between $-\pi$ and $+\pi$. The first and last may be transformed by
putting $t = e^{\theta}$ and noting that

$$e^{\theta} - e^{-\theta} = 2\sinh\theta$$

When they are combined, the result is

$$\frac{1}{2\pi i}\left[e^{(n+1)i\pi} - e^{-(n+1)i\pi}\right]\int_0^\infty e^{-n\theta - x\sinh\theta}\,d\theta
= -\frac{1}{\pi}\sin n\pi\int_0^\infty e^{-n\theta - x\sinh\theta}\,d\theta$$

Hence

$$J_n(x) = \frac{1}{\pi}\left\{\int_0^{\pi}\cos(n\theta - x\sin\theta)\,d\theta
- \sin n\pi\int_0^\infty \exp(-n\theta - x\sinh\theta)\,d\theta\right\} \tag{3-72}$$

This is a generalized form of Bessel's integral, derived by Bessel for integral
values of $n$. In that special case $\sin n\pi = 0$, the second term vanishes, and

$$J_n(x) = \frac{1}{\pi}\int_0^{\pi}\cos(n\theta - x\sin\theta)\,d\theta, \qquad n = \text{integer} \tag{3-72a}$$
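Eq. (3-72) can be tested numerically for both integral and non-integral $n$. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it uses a composite Simpson rule, cuts the infinite range off at $\theta = 12$ (where the integrand is negligible for $x$ of order unity), and compares with the ascending power series written with the gamma function. All function names are hypothetical.

```python
import math

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of sub-intervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def bessel_integral(n, x):
    """Right-hand side of (3-72); assumes x > 0 so the second integral converges."""
    first = simpson(lambda t: math.cos(n * t - x * math.sin(t)), 0.0, math.pi)
    second = simpson(lambda t: math.exp(-n * t - x * math.sinh(t)), 0.0, 12.0)
    return (first - math.sin(n * math.pi) * second) / math.pi

def bessel_series(n, x, terms=40):
    """Ascending series for J_n(x), valid for non-integral n as well."""
    return sum(
        (-1) ** k * (x / 2) ** (n + 2 * k) / (math.factorial(k) * math.gamma(n + k + 1))
        for k in range(terms)
    )

half_order = (bessel_integral(0.5, 2.0), bessel_series(0.5, 2.0))
integer_order = (bessel_integral(3, 2.0), bessel_series(3, 2.0))
```

For integral $n$ the factor $\sin n\pi$ suppresses the second integral, which is the special case (3-72a).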


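Bessel Functions of Half-Odd Order. When $n$ is half an odd integer,
for instance, $p + \tfrac{1}{2}$, $J_n(x)$ takes a particularly simple form and is related
closely to the trigonometric functions. To show this, let us first compute
$J_{1/2}(x)$ by the expansion (63). We may then use the recurrence formulas
to obtain $J_{3/2}$, etc. Thus,

$$J_{1/2}(x) = \left(\frac{x}{2}\right)^{1/2}\sum_{\lambda=0}^{\infty}\frac{(-1)^\lambda (x/2)^{2\lambda}}{\lambda!\,\Gamma(\lambda+\tfrac{3}{2})}$$

But in view of eq. (5), etc., $[\Gamma(z+1) = z\Gamma(z)]$

$$\Gamma(\lambda+\tfrac{3}{2}) = \frac{2\lambda+1}{2}\cdot\frac{2\lambda-1}{2}\cdots\frac{3}{2}\cdot\frac{1}{2}\,\Gamma(\tfrac{1}{2})
= \frac{(2\lambda+1)!}{2^{2\lambda+1}\,\lambda!}\,\sqrt{\pi}$$

When this is substituted in the series for $J_{1/2}(x)$, there results

$$J_{1/2}(x) = \left(\frac{x}{2}\right)^{1/2}\sum_{\lambda}\frac{2(-1)^\lambda x^{2\lambda}}{\sqrt{\pi}\,(2\lambda+1)!}
= \left(\frac{2}{\pi x}\right)^{1/2}\sum_{\lambda}\frac{(-1)^\lambda x^{2\lambda+1}}{(2\lambda+1)!}
= \left(\frac{2}{\pi x}\right)^{1/2}\sin x \tag{3-73}$$

From (67),

$$J_{3/2}(x) = \frac{1}{2x}J_{1/2}(x) - J_{1/2}'(x) = \left(\frac{2}{\pi x}\right)^{1/2}\left(\frac{\sin x}{x} - \cos x\right)$$

This process may be continued if the explicit form of the functions $J_{p+1/2}(x)$
is desired. A general formula is readily obtainable as follows. Eq. (67)
may be written

$$J_{n+1}(x) = \frac{n}{x}J_n(x) - J_n'(x) = -x^{n}\frac{d}{dx}\left[x^{-n}J_n(x)\right]$$

or, by repeated application of (67), and since $d/dx = 2x\,d/d(x^2)$,

$$J_{n+p}(x) = (-2)^p\,x^{n+p}\,\frac{d^p}{d(x^2)^p}\left[x^{-n}J_n(x)\right] \tag{3-74}$$

Hence, on putting $n = \tfrac{1}{2}$,

$$J_{p+1/2}(x) = (-2)^p\,x^{p+1/2}\,\frac{d^p}{d(x^2)^p}\left[x^{-1/2}J_{1/2}(x)\right]
= \frac{(-1)^p(2x)^{p+1/2}}{\pi^{1/2}}\,\frac{d^p}{d(x^2)^p}\left(\frac{\sin x}{x}\right) \tag{3-75}$$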


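The first few functions of half-odd order are given below.

$$\begin{array}{c|l|l}
p & \sqrt{\pi x/2}\;J_{p+1/2}(x) & \sqrt{\pi x/2}\;J_{-p-1/2}(x) \\ \hline
0 & \sin x & \cos x \\
1 & \dfrac{\sin x}{x} - \cos x & -\sin x - \dfrac{\cos x}{x} \\
2 & \left(\dfrac{3}{x^2}-1\right)\sin x - \dfrac{3}{x}\cos x & \dfrac{3}{x}\sin x + \left(\dfrac{3}{x^2}-1\right)\cos x \\
3 & \left(\dfrac{15}{x^3}-\dfrac{6}{x}\right)\sin x - \left(\dfrac{15}{x^2}-1\right)\cos x & -\left(\dfrac{15}{x^2}-1\right)\sin x - \left(\dfrac{15}{x^3}-\dfrac{6}{x}\right)\cos x \\
4 & \left(\dfrac{105}{x^4}-\dfrac{45}{x^2}+1\right)\sin x - \left(\dfrac{105}{x^3}-\dfrac{10}{x}\right)\cos x & \left(\dfrac{105}{x^3}-\dfrac{10}{x}\right)\sin x + \left(\dfrac{105}{x^4}-\dfrac{45}{x^2}+1\right)\cos x
\end{array}$$

When the differentiations in (75) are carried out it is easily established
that the asymptotic form of $J_{p+1/2}$ is given, for all $p$, by

$$\lim_{x\to\infty}J_{p+1/2}(x) = \left(\frac{2}{\pi x}\right)^{1/2}\sin\left(x - \frac{p\pi}{2}\right) \tag{3-76}$$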
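The closed forms for the half-odd orders can be compared directly with the power series. The following sketch is illustrative only; it assumes the usual ascending series with the gamma function, and `half_odd_closed` simply encodes the first three rows of the table for positive order.

```python
import math

def bessel_series(nu, x, terms=40):
    """Ascending power series for J_nu(x) with non-integral nu."""
    return sum(
        (-1) ** k * (x / 2) ** (nu + 2 * k) / (math.factorial(k) * math.gamma(nu + k + 1))
        for k in range(terms)
    )

def half_odd_closed(p, x):
    """J_{p+1/2}(x) for p = 0, 1, 2 from the tabulated trigonometric forms."""
    s, c = math.sin(x), math.cos(x)
    table = {
        0: s,
        1: s / x - c,
        2: (3 / x ** 2 - 1) * s - 3 / x * c,
    }
    # The table lists sqrt(pi x / 2) * J, so divide that factor out again.
    return table[p] * math.sqrt(2 / (math.pi * x))

x = 1.3
differences = [abs(half_odd_closed(p, x) - bessel_series(p + 0.5, x)) for p in range(3)]
```

The same comparison extends to the higher rows, or to the negative orders, by adding the corresponding table entries.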


3.9. Hankel Functions and Summary on Bessel Functions.—The 
Bessel function $J_n(x)$ is only one particular solution of Bessel's differential
equation. However, as was noted in sec. 2.14, $J_{-n}(x)$ is also a particular
solution, for the differential equation is insensitive to the substitution of
$-n$ for $n$. Hence a general solution of the form

$$y = aJ_n(x) + bJ_{-n}(x) \tag{3-77}$$

is at hand provided $J_n$ and $J_{-n}$ are different functions, i.e., are linearly
independent. As was also shown in sec. 2.14, this is true as long as $n$ is not an
integer. The Hankel function, frequently used in physical problems,⁹ is of
interest only in connection with non-integral $n$ and the following remarks
are restricted to that case. This function is a solution of Bessel's equation
of the form (77) with the constants $a$ and $b$ suitably chosen. We distinguish
two kinds of Hankel function, generally denoted by $H_n^{(1)}$ and $H_n^{(2)}$:

$$\begin{aligned}
H_n^{(1)}(x) &= \frac{i}{\sin n\pi}\left[e^{-in\pi}J_n(x) - J_{-n}(x)\right] \\
H_n^{(2)}(x) &= -\frac{i}{\sin n\pi}\left[e^{in\pi}J_n(x) - J_{-n}(x)\right]
\end{aligned} \tag{3-78}$$

Hence, conversely,

$$\begin{aligned}
J_n(x) &= \tfrac{1}{2}\left[H_n^{(1)}(x) + H_n^{(2)}(x)\right] \\
J_{-n}(x) &= \tfrac{1}{2}\left[e^{in\pi}H_n^{(1)}(x) + e^{-in\pi}H_n^{(2)}(x)\right]
\end{aligned}$$

These definitions hold, of course, for complex as well as for real values of the
argument. Hankel functions are particularly useful for complex arguments,
for they vanish strongly when the modulus of the argument
approaches infinity, which is a requirement in many physical problems.

The qualitative properties of Bessel and Hankel functions may be
summarized in the following brief survey.

A. $J_n(x)$ is real if $x$ is real, complex if $x$ is complex.

1. At $x = 0$,

$$J_n(x) = \begin{cases}
1 & \text{if } n = 0 \\
0 & \text{if } n > 0 \\
0 & \text{if } n < 0, \text{ and } n \text{ is an integer} \\
\infty & \text{if } n < 0, \text{ and } n \text{ is not an integer}
\end{cases}$$

2. At $x \to \infty$, all $J_n(x)$ oscillate, but with ever-decreasing amplitude
(provided $x$ is real).

$$\lim_{x\to\infty}J_n(x) = \begin{cases}
\pm\sqrt{\dfrac{2}{\pi x}}\cos\left(x - \dfrac{\pi}{4}\right) & \text{if } n \text{ is even} \\[2ex]
\pm\sqrt{\dfrac{2}{\pi x}}\sin\left(x - \dfrac{\pi}{4}\right) & \text{if } n \text{ is odd}
\end{cases}$$

9 See, for instance, Stratton, J., "Electromagnetic Theory," McGraw-Hill Book Co.,
1941. For applications in: propagation of radio waves, cf. Sommerfeld, A., Ann. der
Phys. 28, 692 (1909); theory of optical diffraction, cf. Wolfsohn, G., Handb. d. Phys.,
XX, p. 282; quantum mechanics, cf. Margenau, H., Phys. Rev. 46, 613 (1934).

B. $H_n(x)$ is complex if $x$ is real, but $i^{n+1}H_n^{(1)}(ix)$ and $i^{-(n+1)}H_n^{(2)}(-ix)$
are always real if $x$ is real and $> 0$.

1. At $x = 0$, both $H_n^{(1)}$ and $H_n^{(2)}$ become infinite. In fact

$$\lim_{x\to 0} i^{n+1}H_n^{(1)}(ix) = \lim_{x\to 0} i^{-(n+1)}H_n^{(2)}(-ix) = \frac{(n-1)!}{\pi}\left(\frac{2}{x}\right)^{n}$$

2. At $x \to \infty$, either $H_n^{(1)}(x)$ or $H_n^{(2)}(x)$ vanishes exponentially.

$$\lim_{|x|\to\infty}H_n^{(1)}(x) = \begin{cases}
0 & \text{if the imaginary part of } x > 0 \\
\infty & \text{if the imaginary part of } x < 0
\end{cases}$$

$$\lim_{|x|\to\infty}H_n^{(2)}(x) = \begin{cases}
\infty & \text{if the imaginary part of } x > 0 \\
0 & \text{if the imaginary part of } x < 0
\end{cases}$$

The behavior at infinity of both $J_n$ and $H_n$ is most easily remembered by
noting the general similarity between

$$\begin{aligned}
H_n^{(1)}(x) &\text{ and } e^{ix} \\
H_n^{(2)}(x) &\text{ and } e^{-ix} \\
J_n(x) &\text{ and } \tfrac{1}{2}\left(e^{ix} + e^{-ix}\right) = \cos x
\end{aligned}$$

The important difference between the Bessel functions and the circular 
functions is in the fact that the former have neither constant amplitude nor 
constant wave length. 

Useful Formulas Involving Bessel Functions. We conclude the discussion
of Bessel functions by appending here a list of formulas involving
Bessel functions. Some of these are easily proved with the use of the
theory here developed; to establish others reference should be made to
more comprehensive treatises, such as that of Nielsen¹⁰ and that of Gray
and Mathews.¹¹ An extensive table of differential equations having Bessel
functions as solutions is given in Jahnke and Emde.¹²

$$|J_0(x)| \le 1, \qquad |J_n(x)| \le \frac{1}{\sqrt{2}} \ \text{ for } n \ge 1; \quad x \text{ real}$$

$$[J_0(x)]^2 + 2\sum_{n=1}^{\infty}[J_n(x)]^2 = 1$$

$$J_n(x)J_{-n+1}(x) + J_{-n}(x)J_{n-1}(x) = \frac{2\sin n\pi}{\pi x}$$

10 Nielsen, N., "Handbuch der Theorie der Cylinderfunktionen," Teubner, 1904.

11 Gray, A., and Mathews, G. B., "A Treatise on Bessel Functions," Macmillan,
London, 1922.

12 Jahnke, E., and Emde, F., "Tables of Functions With Formulae and Curves,"
Dover Publications, New York, 1943, pp. 146-147. See also Kamke, E.,
"Differentialgleichungen, Lösungsmethoden und Lösungen," Bd. 1. Third Edition,
Chelsea Publishing Co., New York, 1948.

mh


q bynes (2) Ina (2) — (Jn(x))?] 
ete mJ (a) Pda 


provided is a positive integer and m + 1 > 0 


$$J_{n-1}(x)H_n^{(1)}(x) - J_n(x)H_{n-1}^{(1)}(x) = H_{n-1}^{(2)}(x)J_n(x) - H_n^{(2)}(x)J_{n-1}(x) = \frac{2}{\pi i x}$$


Jazi + 22) -(1+ ay ES (“ie (1 +32) Jan (2) 


$$\int_0^x J_n(x)\,dx = 2\sum_{\lambda=0}^{\infty}J_{n+2\lambda+1}(x)$$


S Japi (2) (2da = 


u 


$$\int x\,[J_n(\alpha x)]^2\,dx = \frac{x^2}{2}\left\{[J_n(\alpha x)]^2 - J_{n-1}(\alpha x)J_{n+1}(\alpha x)\right\}$$

This formula is also valid when all $J$'s are replaced by $H^{(1)}$ or $H^{(2)}$.
m-+ 1 
r ( 2 ) 


m— 1 
T ee 
(« 2 


f arm] larde = ranm 
o 


f2n+1l>m> -l 
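$$\lim_{|x|\to\infty}J_n(x) = \left(\frac{2}{\pi x}\right)^{1/2}\times
\left[\cos\left(x - \frac{n\pi}{2} - \frac{\pi}{4}\right) - \frac{4n^2-1}{8x}\sin\left(x - \frac{n\pi}{2} - \frac{\pi}{4}\right) + O\left(\frac{1}{x^2}\right)\right]^{13}$$

$$\int_0^x xJ_n(\alpha x)J_n(\beta x)\,dx = \frac{\beta x\,J_n(\alpha x)J_{n-1}(\beta x) - \alpha x\,J_{n-1}(\alpha x)J_n(\beta x)}{\alpha^2 - \beta^2}$$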


3.10. Hermite Polynomials and Functions.—In sec. 2.15 the Hermite 
polynomial of degree $n$ has been defined as the polynomial solution of
Hermite's differential equation

$$y'' - 2xy' + 2ny = 0 \tag{3-80}$$

13 The notation $O(1/x^2)$ is to be read "terms of the order $1/x^2$."

Such solutions were seen to exist when $n$ is an integer. Explicitly,

$$y = H_n(x) = (2x)^n - \frac{n(n-1)}{1!}(2x)^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2!}(2x)^{n-4} - \cdots \tag{3-81}$$


We shall now find an equivalent expression for $H_n$ in terms of a definite
integral. If we put

$$y_n = \frac{1}{2\pi i}\oint z^{-n-1}e^{x^2-(z-x)^2}\,dz \tag{3-82}$$

and take the contour around a circle which has the origin as its center, then

$$\frac{dy_n}{dx} = \frac{1}{2\pi i}\oint 2z^{-n}e^{x^2-(z-x)^2}\,dz \tag{3-82a}$$

and

$$\frac{d^2y_n}{dx^2} = \frac{1}{2\pi i}\oint 4z^{-n+1}e^{x^2-(z-x)^2}\,dz$$

The differentiations here may be performed under the integral sign. When
these derivatives are substituted on the left of the differential equation (80),
it is found that

$$y_n'' - 2xy_n' + 2ny_n = \frac{1}{2\pi i}\oint\left(4z^2 - 4xz + 2n\right)e^{x^2-(z-x)^2}z^{-n-1}\,dz
= -\frac{2}{2\pi i}\oint\frac{d}{dz}\left(z^{-n}e^{x^2-(z-x)^2}\right)dz = 0$$

The last step follows because the contents of the parenthesis, being a
single-valued function of $z$, if $n$ is an integer, takes the same value at the
initial and final points of the contour integration. It has thus been shown
that expression (82) is also a solution of Hermite's equation.¹⁴ Since it
represents a polynomial in $x$ it must be identical with $H_n(x)$ except for a
constant multiplier. This constant may be found by computing, for
example, $H_n(0)$ from (81) and $y_n(0)$ from (82) for even $n$ (since otherwise
$H_n(0)$ would vanish). Eq. (81) gives

$$H_n(0) = \frac{(-1)^{n/2}\,n!}{(n/2)!}$$

14 The function $y_n$ defined by (82) is a solution of Hermite's differential equation even
when $n$ is non-integral, but in that case the contour must be specified differently: to
make the integrand return to its original value, the path must start at $+\infty$, go in toward
the origin, encircle it in a counter-clockwise sense and return to $+\infty$.

while from (82) we obtain

$$y_n(0) = \frac{1}{2\pi i}\oint z^{-n-1}e^{-z^2}\,dz = \frac{(-1)^{n/2}}{(n/2)!}$$

by the theorem of residues (eq. 3). Hence we see that $H_n$ is $n!$ times $y_n$:

$$H_n(x) = \frac{n!}{2\pi i}\oint z^{-n-1}e^{x^2-(z-x)^2}\,dz \tag{3-83}$$

This result may be expressed in a different way. On examining (82) in the
light of the theorem of residues it is apparent that $y_n$ is the coefficient of
$z^n$ in the expansion of $e^{x^2-(z-x)^2}$ as a power series in $z$. Hence

$$e^{x^2-(z-x)^2} = \sum_{n=0}^{\infty}y_n z^n = \sum_{n=0}^{\infty}\frac{H_n(x)}{n!}z^n \tag{3-84}$$


Recurrence relations between Hermite polynomials of different degree
are easily derived. The first is implicit in eq. (82a), which may now be
written $y_n'(x) = 2y_{n-1}(x)$, or

$$H_n'(x) = 2nH_{n-1}(x) \tag{3-85}$$

The second follows from the differential equation

$$H_n''(x) - 2xH_n'(x) + 2nH_n(x) = 0 \tag{3-86}$$

Others may be derived ad libitum by repeated application of (85):
$H_n'' = 4n(n-1)H_{n-2}$, etc.

Thus far two representations of $H_n$ have been obtained, the series form
(81) and the integral form (83). A third may be deduced from (84). Let
us take the $n$-th derivative with respect to $z$ on both sides. The left
becomes

$$e^{x^2}\frac{\partial^n}{\partial z^n}e^{-(z-x)^2} = e^{x^2}(-1)^n\frac{\partial^n}{\partial x^n}e^{-(z-x)^2}$$

and the right simply

$$H_n(x) + H_{n+1}(x)\,z + \cdots$$

These two expressions are equal for all values of $z$. On putting $z = 0$, there
results

$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2} \tag{3-87}$$


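A function closely related to the polynomial $H_n(x)$ was introduced in
sec. 2.15. It is

$$y = e^{-x^2/2}H_n(x) \tag{3-88}$$

and satisfies the differential equation

$$y'' + (1 - x^2 + 2n)y = 0 \tag{3-89}$$

The function defined by (88) is called the Hermite (orthogonal) function.
It is of interest because it appears (cf. sec. 11.11) as an eigenfunction in the
quantum mechanical problem of the simple harmonic oscillator. We shall
here derive a few integrals involving this function which will be found
useful later.

The first is the integral over the product of two Hermite functions,

$$\int_{-\infty}^{\infty}e^{-x^2}H_n(x)H_m(x)\,dx$$

In view of (84)

$$e^{x^2-(z_1-x)^2}\cdot e^{x^2-(z_2-x)^2} = \sum_{\lambda}\sum_{\mu}\frac{H_\lambda(x)H_\mu(x)}{\lambda!\,\mu!}\,z_1^\lambda z_2^\mu$$

Hence, multiplying each side by $e^{-x^2}$ and integrating

$$\sum_{\lambda,\mu}\frac{z_1^\lambda z_2^\mu}{\lambda!\,\mu!}\int_{-\infty}^{\infty}e^{-x^2}H_\lambda(x)H_\mu(x)\,dx
= \int_{-\infty}^{\infty}e^{-x^2+2x(z_1+z_2)-z_1^2-z_2^2}\,dx \tag{3-90}$$

The integral on the right has the value¹⁵ $\sqrt{\pi}\,e^{2z_1z_2}$.
This may be expanded to read

$$\sqrt{\pi}\,e^{2z_1z_2} = \sqrt{\pi}\sum_{\lambda}\frac{2^\lambda z_1^\lambda z_2^\lambda}{\lambda!}
= \sqrt{\pi}\sum_{\lambda}\sum_{\mu}\frac{2^\lambda z_1^\lambda z_2^\mu}{\lambda!}\,\delta_{\lambda\mu} \tag{3-91}$$

where the single summation over $\lambda$ has been changed into a double summation
over $\lambda$ and $\mu$ by the artificial use of the Kronecker $\delta$-symbol, defined
on page 104. Since (90) is true for every value of $z_1$ and $z_2$, the individual
coefficients of every power of $z_1$ and $z_2$ in both expansions must be equal.
On comparing (91) with the left side of (90) we see that

$$\frac{1}{\lambda!\,\mu!}\int_{-\infty}^{\infty}e^{-x^2}H_\lambda H_\mu\,dx = \sqrt{\pi}\,\frac{2^\lambda}{\lambda!}\,\delta_{\lambda\mu}$$

or

$$\int_{-\infty}^{\infty}e^{-x^2}H_n(x)H_m(x)\,dx = 2^n n!\sqrt{\pi}\,\delta_{nm} \tag{3-92}$$

15 In evaluating it, use is made of the formula:

$$\int_{-\infty}^{\infty}e^{-ax^2+bx}\,dx = \sqrt{\frac{\pi}{a}}\;e^{b^2/4a}$$

The integral

$$\int_{-\infty}^{\infty}x\,e^{-x^2}H_n(x)H_m(x)\,dx$$

can be evaluated in a similar way. In place of (90) we now write

$$\sum_{\lambda,\mu}\frac{z_1^\lambda z_2^\mu}{\lambda!\,\mu!}\int_{-\infty}^{\infty}x\,e^{-x^2}H_\lambda(x)H_\mu(x)\,dx
= \int_{-\infty}^{\infty}x\,e^{-x^2+2x(z_1+z_2)-z_1^2-z_2^2}\,dx
= \sqrt{\pi}\,(z_1+z_2)\,e^{2z_1z_2}$$

The last result is, on expansion,

$$\sqrt{\pi}\sum_{\lambda}\frac{2^\lambda}{\lambda!}\left(z_1^{\lambda+1}z_2^{\lambda} + z_1^{\lambda}z_2^{\lambda+1}\right)
= \sqrt{\pi}\sum_{\lambda}\sum_{\mu}\frac{2^\lambda}{\lambda!}\left(z_1^{\lambda+1}z_2^{\mu} + z_1^{\lambda}z_2^{\mu+1}\right)\delta_{\lambda\mu}$$

Equating coefficients of $z_1^n z_2^m$ then yields

$$\int_{-\infty}^{\infty}x\,e^{-x^2}H_n(x)H_m(x)\,dx = \sqrt{\pi}\left[2^{n-1}n!\,\delta_{m,n-1} + 2^n(n+1)!\,\delta_{m,n+1}\right] \tag{3-93}$$

The integral vanishes when $n = m$ and also when $n$ and $m$ differ by more
than unity. The same method may be used to calculate other integrals
of the type

$$\int_{-\infty}^{\infty}x^{p}\,e^{-x^2}H_n(x)H_m(x)\,dx$$

Later, however, we shall learn of simpler ways, involving matrix algebra,
for deriving these from the result established in eq. (93). (See problem at
the end of sec. 11.17.)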
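Results (3-92) and (3-93) can be confirmed exactly, without numerical quadrature, because every term reduces to a Gaussian moment $\int x^{2q}e^{-x^2}dx = \sqrt{\pi}\,(2q-1)!!/2^q$. The sketch below is an illustration, not the authors' method: it builds $H_n$ from the series (81) and returns each integral divided by $\sqrt{\pi}$ as an exact fraction. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def hermite(n):
    """Coefficients c[j] of x^j in H_n(x), taken from the series (81)."""
    c = [0] * (n + 1)
    for k in range(n // 2 + 1):
        magnitude = factorial(n) // (factorial(k) * factorial(n - 2 * k))
        c[n - 2 * k] = (-1) ** k * magnitude * 2 ** (n - 2 * k)
    return c

def gauss_moment(j):
    """Integral of x^j exp(-x^2) over the real line, divided by sqrt(pi)."""
    if j % 2:
        return Fraction(0)
    double_factorial = Fraction(1)
    for i in range(1, j, 2):          # (j-1)!! = 1 * 3 * 5 * ... * (j-1)
        double_factorial *= i
    return double_factorial / 2 ** (j // 2)

def weighted_integral(n, m, power=0):
    """(1/sqrt(pi)) * integral of x^power exp(-x^2) H_n(x) H_m(x) dx."""
    a, b = hermite(n), hermite(m)
    return sum(
        a[i] * b[j] * gauss_moment(i + j + power)
        for i in range(len(a))
        for j in range(len(b))
    )
```

Calling `weighted_integral(n, m, 2)` with `n == m` gives the numerator integral needed for the mean square displacement of the oscillator.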

Example. A simple harmonic oscillator, if treated by the methods of
quantum mechanics, has a distribution of mass about the attracting center
which is given by

$$P(\xi) = c\,e^{-\xi^2}[H_n(\xi)]^2$$

where $\xi = \sqrt{\beta}\,x$, $\beta$ being a quantity characteristic of the oscillator, and $n$ is
a quantum number which depends on the total energy possessed by the
vibrating point. The moment of inertia of this mass distribution is given
by

$$I = m\overline{x^2} = m\,\frac{\int_{-\infty}^{\infty}x^2e^{-\xi^2}[H_n(\xi)]^2\,dx}{\int_{-\infty}^{\infty}e^{-\xi^2}[H_n(\xi)]^2\,dx}
= \frac{m}{\beta}\,\frac{\int_{-\infty}^{\infty}\xi^2e^{-\xi^2}[H_n(\xi)]^2\,d\xi}{\int_{-\infty}^{\infty}e^{-\xi^2}[H_n(\xi)]^2\,d\xi}$$

The integral in the denominator has already been calculated (cf. 92) and is
equal to $2^n n!\sqrt{\pi}$. The integral in the numerator may be computed by the
same method and is found to be

$$2^n n!\left(n + \tfrac{1}{2}\right)\sqrt{\pi}$$

Hence

$$I = \frac{m}{\beta}\left(n + \tfrac{1}{2}\right)$$

Later (cf. sec. 11.11) it will be shown that $\beta = 4\pi^2 m\nu_0/h$ so that

$$I = \frac{h}{4\pi^2\nu_0}\left(n + \tfrac{1}{2}\right)$$

where $\nu_0$ is the "classical" frequency of the oscillator.


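LIST OF HERMITE POLYNOMIALS

$$\begin{aligned}
H_0(\xi) &= 1 \\
H_1(\xi) &= 2\xi \\
H_2(\xi) &= 4\xi^2 - 2 \\
H_3(\xi) &= 8\xi^3 - 12\xi \\
H_4(\xi) &= 16\xi^4 - 48\xi^2 + 12 \\
H_5(\xi) &= 32\xi^5 - 160\xi^3 + 120\xi \\
H_6(\xi) &= 64\xi^6 - 480\xi^4 + 720\xi^2 - 120 \\
H_7(\xi) &= 128\xi^7 - 1344\xi^5 + 3360\xi^3 - 1680\xi \\
H_8(\xi) &= 256\xi^8 - 3584\xi^6 + 13440\xi^4 - 13440\xi^2 + 1680 \\
H_9(\xi) &= 512\xi^9 - 9216\xi^7 + 48384\xi^5 - 80640\xi^3 + 30240\xi \\
H_{10}(\xi) &= 1024\xi^{10} - 23040\xi^8 + 161280\xi^6 - 403200\xi^4 + 302400\xi^2 - 30240
\end{aligned}$$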
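The entries of this list follow directly from the series (81), whose general term is $(-1)^k\,n!\,(2\xi)^{n-2k}/[k!\,(n-2k)!]$. A minimal sketch for regenerating them is given below; the function name `hermite_coeffs` is illustrative. In particular it gives 13440 for the $\xi^2$ coefficient of $H_8$ and 80640 for the $\xi^3$ coefficient of $H_9$.

```python
from math import factorial

def hermite_coeffs(n):
    """Map each power of xi to its integer coefficient in H_n(xi), from series (81)."""
    coeffs = {}
    for k in range(n // 2 + 1):
        magnitude = factorial(n) // (factorial(k) * factorial(n - 2 * k))
        coeffs[n - 2 * k] = (-1) ** k * magnitude * 2 ** (n - 2 * k)
    return coeffs

# Print the list in descending powers, one polynomial per line.
for n in range(11):
    terms = [f"{c:+d} xi^{power}" for power, c in sorted(hermite_coeffs(n).items(), reverse=True)]
    print(f"H_{n} =", " ".join(terms))
```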

3.11. Laguerre Polynomials and Functions.—The theory of Laguerre 

polynomials may be developed along lines very similar to those of the last 
section. A Laguerre polynomial $L_n(x)$ has been defined in sec. 2.16 as the
polynomial solution of Laguerre's differential eq. (2-67):

$$xy'' + (1 - x)y' + ny = 0 \tag{3-94}$$

It exists whenever $n$ is an integer and was found to be

$$L_n(x) = (-1)^n\left(x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n n!\right) \tag{3-95}$$

We first establish a representation of $L_n$ in the form of a definite integral.
Consider

$$y_n = \frac{1}{2\pi i}\oint\frac{z^{-n-1}}{1-z}\exp\left(\frac{-xz}{1-z}\right)dz \tag{3-96}$$

where the contour is taken to include the origin. Differentiations with
respect to $x$ may be performed under the integral sign; hence

$$y_n' = -\frac{1}{2\pi i}\oint\frac{z^{-n}}{(1-z)^2}\exp\left(\frac{-xz}{1-z}\right)dz$$

$$y_n'' = \frac{1}{2\pi i}\oint\frac{z^{-n+1}}{(1-z)^3}\exp\left(\frac{-xz}{1-z}\right)dz$$

On substituting in the left-hand side of (94) we find

$$\frac{1}{2\pi i}\oint\left[\frac{xz^{-n+1}}{(1-z)^3} - \frac{(1-x)z^{-n}}{(1-z)^2} + \frac{nz^{-n-1}}{1-z}\right]\exp\left(\frac{-xz}{1-z}\right)dz$$

But this is easily seen to be

$$-\frac{1}{2\pi i}\oint\frac{d}{dz}\left[\frac{z^{-n}}{1-z}\exp\left(\frac{-xz}{1-z}\right)\right]dz$$

an expression which vanishes because the quantity in brackets takes on the
same value at the initial and final point of the contour. Hence (96) is a
solution of Laguerre's differential equation. Moreover, it is a polynomial,
as an analysis in the light of the theorem of residues will show. Its relation
to $L_n(x)$ may be established by computing both $y_n(x)$ and $L_n(x)$ for
a particular value of $x$, say zero. From (95)

$$L_n(0) = n!$$

and from (96)

$$y_n(0) = \frac{1}{2\pi i}\oint\frac{z^{-n-1}}{1-z}\,dz = \frac{1}{2\pi i}\oint z^{-n-1}\left(1 + z + z^2 + \cdots\right)dz = 1$$

Therefore

$$L_n = n!\,y_n$$

Again, using the theorem of residues (eq. 3), we find, since

$$L_n(x) = \frac{n!}{2\pi i}\oint\frac{z^{-n-1}}{1-z}\exp\left(\frac{-xz}{1-z}\right)dz \tag{3-97}$$

that

$$(1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=0}^{\infty}y_n z^n = \sum_{n=0}^{\infty}\frac{L_n(x)}{n!}z^n \tag{3-98}$$

This result is quite similar to (84).




Next, we turn our attention to the recurrence relations existing between
Laguerre polynomials of different degrees, and between the derivatives of a
given polynomial. A relation of the latter type follows at once from the
differential equation:

$$xL_n'' + (1 - x)L_n' + nL_n = 0 \tag{3-99}$$

The former relation may be obtained by differentiating (98) with respect
to $z$:

$$\frac{1-x-z}{(1-z)^3}\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=1}^{\infty}\frac{L_n(x)}{(n-1)!}z^{n-1}$$

When the left-hand side of this equation is again expressed in terms of
Laguerre polynomials with the use of (98), the result may be written

$$(1-x-z)\sum_{n}\frac{L_n(x)}{n!}z^n = (1 - 2z + z^2)\sum_{n}\frac{L_n(x)}{(n-1)!}z^{n-1}$$

On equating the coefficients of $z^n$, there results

$$\frac{(1-x)L_n}{n!} - \frac{L_{n-1}}{(n-1)!} = \frac{L_{n+1}}{n!} - \frac{2L_n}{(n-1)!} + \frac{L_{n-1}}{(n-2)!}$$

whence,

$$(1 + 2n - x)L_n - n^2L_{n-1} - L_{n+1} = 0 \tag{3-100}$$

which is the relation here sought.

For some purposes it is convenient to have $L_n$ in the form of a derivative.
To find it we differentiate (98) $n$ times with respect to $z$ and afterwards
put $z = 0$; since $\exp[-xz/(1-z)] = e^{x}\exp[-x/(1-z)]$, this gives

$$e^{x}\lim_{z\to 0}\frac{d^n}{dz^n}\left[(1-z)^{-1}\exp\left(\frac{-x}{1-z}\right)\right] = L_n(x)$$

The reader will be able to show without difficulty that

$$\lim_{z\to 0}\frac{d^n}{dz^n}\left[(1-z)^{-1}\exp\left(\frac{-x}{1-z}\right)\right] = \frac{d^n}{dx^n}\left(x^n e^{-x}\right)$$

Hence

$$L_n(x) = e^{x}\frac{d^n}{dx^n}\left(x^n e^{-x}\right) \tag{3-101}$$

The associated Laguerre polynomial, of degree $n - k$, was shown in
sec. 2.16 to satisfy the differential equation (2-70)

$$xy'' + (k + 1 - x)y' + (n - k)y = 0$$

and is given by

$$y = L_n^k(x) = \frac{d^k}{dx^k}L_n(x) \tag{3-102}$$

On differentiating (98) $k$ times with respect to $x$, it is seen at once that

$$\frac{(-1)^k z^k}{(1-z)^{k+1}}\exp\left(\frac{-xz}{1-z}\right) = \sum_{\lambda=k}^{\infty}\frac{L_\lambda^k(x)}{\lambda!}z^\lambda \tag{3-103}$$


A function of great importance in quantum mechanics is the associated
Laguerre function, for it describes, in a sense to be discussed fully in Chapter
11, the motion of the electron in the hydrogen atom. It satisfies differential
eq. (71) in sec. 2.16, and was there shown to be represented by

$$y_{nk} = e^{-x/2}x^{(k-1)/2}L_n^k(x) \tag{3-104}$$

Certain integrals involving this function are often used and will here be
calculated. They are of the form

$$I_{n,m} = \int_0^\infty e^{-x}x^{k-1}L_n^k(x)L_m^k(x)\cdot x^p\,dx$$

where $p$ is another integer which we shall take in this work to be either
1, 2, or 3. Furthermore, our interest will be confined to $I_{n,n}$. If we multiply
eq. (103) in which $z_1$ is written for $z$, by a similar one in which $z$ is
replaced by $z_2$, there results

$$\sum_{\lambda,\mu=k}^{\infty}\frac{z_1^\lambda z_2^\mu}{\lambda!\,\mu!}L_\lambda^k(x)L_\mu^k(x)
= (z_1z_2)^k(1-z_1)^{-k-1}(1-z_2)^{-k-1}\exp\left(-\frac{xz_1}{1-z_1} - \frac{xz_2}{1-z_2}\right)$$

Let us now multiply each side of this equation by $e^{-x}x^{k+p-1}$ and then integrate
with respect to $x$. In view of the definition of $I_{\lambda,\mu}$, the result may
be written

$$\sum_{\lambda,\mu=k}^{\infty}\frac{z_1^\lambda z_2^\mu}{\lambda!\,\mu!}I_{\lambda,\mu}
= (z_1z_2)^k(1-z_1)^{-k-1}(1-z_2)^{-k-1}\int_0^\infty x^{k+p-1}\exp\left[x\left(1 - \frac{1}{1-z_1} - \frac{1}{1-z_2}\right)\right]dx$$

Now

$$\int_0^\infty e^{-ax}x^{r}\,dx = r!\,a^{-r-1}$$

as may be shown by $r$-fold partial integration or from eq. (10). If we put

$$a = (1-z_1)^{-1} + (1-z_2)^{-1} - 1 = (1 - z_1z_2)(1-z_1)^{-1}(1-z_2)^{-1}$$

we obtain, therefore,

$$\sum_{\lambda,\mu=k}^{\infty}\frac{z_1^\lambda z_2^\mu}{\lambda!\,\mu!}I_{\lambda,\mu}
= (k+p-1)!\,\frac{(z_1z_2)^k(1-z_1)^{p-1}(1-z_2)^{p-1}}{(1-z_1z_2)^{k+p}} \tag{3-105}$$

When the denominator on the right is expanded by the binomial theorem

$$(1-z_1z_2)^{-k-p} = \sum_{\lambda}\binom{k+p+\lambda-1}{\lambda}(z_1z_2)^\lambda
= \sum_{\lambda}\frac{(k+p+\lambda-1)!}{(k+p-1)!\,\lambda!}(z_1z_2)^\lambda$$

the right-hand side of (105) becomes

$$(1-z_1)^{p-1}(1-z_2)^{p-1}\sum_{\lambda}\frac{(k+p+\lambda-1)!}{\lambda!}(z_1z_2)^{k+\lambda} \tag{3-106}$$

Thus, in view of (105), $I_{n,n}$ is simply $(n!)^2$ times the coefficient of $(z_1z_2)^n$
of this expression.


a. When $p = 1$, this is obtained by choosing that term of the summation
in (106) for which $k + \lambda = n$, that is $\lambda = n - k$.

$$I_{n,n} = (n!)^3/(n-k)! \tag{3-107a}$$

b. When $p = 2$, (106) becomes

$$\sum_{\lambda}\frac{(k+\lambda+1)!}{\lambda!}\left[(z_1z_2)^{k+\lambda} - z_1^{k+\lambda+1}z_2^{k+\lambda} - z_1^{k+\lambda}z_2^{k+\lambda+1} + (z_1z_2)^{k+\lambda+1}\right]$$

The second and third terms in the bracket, in which $z_1$ and $z_2$ appear with
different exponents, cannot contribute to $I_{n,n}$; the first terms contribute
when $\lambda = n - k$, the last when $\lambda = n - k - 1$. Hence

$$I_{n,n} = (n!)^2\left[\frac{(n+1)!}{(n-k)!} + \frac{n!}{(n-k-1)!}\right]
= \frac{(n!)^3}{(n-k)!}(2n - k + 1) \tag{3-107b}$$

c. When $p = 3$, the significant parts of (106) are

$$\sum_{\lambda}\frac{(k+\lambda+2)!}{\lambda!}\left[(z_1z_2)^{k+\lambda} + 4(z_1z_2)^{k+\lambda+1} + (z_1z_2)^{k+\lambda+2}\right]$$

terms with different exponents of $z_1$ and $z_2$ having been omitted. Consequently,

$$I_{n,n} = (n!)^2\left[\frac{(n+2)!}{(n-k)!} + \frac{4(n+1)!}{(n-k-1)!} + \frac{n!}{(n-k-2)!}\right]
= \frac{(n!)^3}{(n-k)!}\left(6n^2 - 6nk + k^2 + 6n - 3k + 2\right) \tag{3-107c}$$

Obviously, the same process permits the evaluation of $I_{n,n}$ for any value of
$p$. The quantities $I_{n,m}$ for $n \ne m$ are rarely needed, but can be obtained
by this method also.
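The closed forms (3-107a) to (3-107c) can be checked with exact integer arithmetic. The sketch below is an illustration under the normalization of this section, $L_n(0) = n!$, for which the coefficient of $x^j$ in $L_n$ is $(-1)^j (n!)^2/[(j!)^2 (n-j)!]$; it differentiates $k$ times and integrates term by term with $\int_0^\infty e^{-x}x^r dx = r!$. The function names are hypothetical.

```python
from math import factorial

def laguerre(n):
    """Integer coefficients c[j] of x^j in L_n(x), normalized so that L_n(0) = n!."""
    return [
        (-1) ** j * (factorial(n) ** 2 // (factorial(j) ** 2 * factorial(n - j)))
        for j in range(n + 1)
    ]

def derivative(coeffs, k):
    """Coefficients of the k-th derivative of a polynomial."""
    for _ in range(k):
        coeffs = [j * coeffs[j] for j in range(1, len(coeffs))]
    return coeffs

def laguerre_integral(n, k, p):
    """I_{n,n}: integral over (0, inf) of exp(-x) x^(k-1+p) [L_n^k(x)]^2 dx."""
    a = derivative(laguerre(n), k)
    return sum(
        a[i] * a[j] * factorial(i + j + k - 1 + p)
        for i in range(len(a))
        for j in range(len(a))
    )

n, k = 4, 2
prefactor = factorial(n) ** 3 // factorial(n - k)      # (n!)^3 / (n-k)!
```

With `n, k = 4, 2` the three integrals come out as the prefactor times 1, $2n-k+1 = 7$, and $6n^2-6nk+k^2+6n-3k+2 = 72$, respectively.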

Example. The electronic charge of the hydrogen atom is distributed
about the proton as origin in accordance with the distribution function

$$P(\rho) = c\,\rho^{2l+2}e^{-\rho}\left[L_{n+l}^{2l+1}(\rho)\right]^2$$

as will be shown in sec. 11.13. In this expression $n$ and $l$ stand for the
"total" and "angular momentum" quantum numbers which designate
the state of the atom; $c$ is a constant which is different for different states,
and $\rho$ is proportional to $r$, the radius vector: $\rho = (2/na_0)r$. The proportionality
constant depends on the quantum number $n$ and differs for different
states of the atom; $a_0$ is the fundamental constant known as the first
"Bohr radius." $P(\rho)\,d\rho$, finally, represents the charge to be found within
the spherical shell enclosed between $\rho$ and $\rho + d\rho$.

Let it be desired to find the mean value of $1/r$ and $r$ for this distribution.
Clearly

$$\overline{r^{-1}} = \frac{2}{na_0}\,\frac{\int_0^\infty \rho^{-1}P(\rho)\,d\rho}{\int_0^\infty P(\rho)\,d\rho}$$

The integral in the numerator is simply $I_{n+l,\,n+l}$ with $k = 2l + 1$ and
$p = 1$, that in the denominator is also $I_{n+l,\,n+l}$ with the same $k$, but with
$p = 2$. Hence, using (107a and b)

$$\overline{r^{-1}} = \frac{2}{na_0}\cdot\frac{1}{2(n+l) - (2l+1) + 1} = \frac{1}{n^2a_0}$$

Similarly,

$$\overline{r} = \frac{na_0}{2}\,\frac{\int_0^\infty \rho P(\rho)\,d\rho}{\int_0^\infty P(\rho)\,d\rho}
= \frac{na_0}{2}\,\frac{I_{n+l,\,n+l}(p = 3)}{I_{n+l,\,n+l}(p = 2)}$$

In view of (107c and b), with $k = 2l + 1$,

$$\overline{r} = \frac{na_0}{2}\cdot\frac{6n^2 - 2l(l+1)}{2n} = \frac{a_0}{2}\left[3n^2 - l(l+1)\right]$$

For the ground state of the hydrogen atom ($n = 1$, $l = 0$)

$$\overline{r^{-1}} = a_0^{-1} \quad\text{and}\quad \overline{r} = \tfrac{3}{2}a_0$$

3.12. Generating Functions.—A simple and powerful way of representing
functions of the more unfamiliar types is by means of generating functions,
that is, functions of two arguments which, when expanded in a power
series with respect to one argument, contain the functions to be generated
as coefficients involving the other argument parametrically. Examples of
generating functions have occurred in the preceding sections; they will here
be exhibited once more for easy reference.

1. Legendre Polynomials.

$$(1 - 2xy + y^2)^{-1/2} = \sum_{l=0}^{\infty}P_l(x)\,y^l \qquad \text{(Cf. eq. 24)}$$

2. Associated Legendre Polynomials.

$$\frac{(2m)!\,(1-x^2)^{m/2}}{2^m\,m!\,(1 - 2xy + y^2)^{m+1/2}} = \sum_{l=0}^{\infty}P_{l+m}^m(x)\,y^l$$

This was not used in the text, but is easily derived by differentiation
from (24) on the basis of (43).

3. Bessel Functions (of integral order).

$$\exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right] = \sum_{n=-\infty}^{\infty}J_n(x)\,u^n \qquad \text{(Cf. 64)}$$

4. Hermite Polynomials.

$$\exp\left[x^2 - (z - x)^2\right] = \sum_{n=0}^{\infty}\frac{H_n(x)}{n!}z^n \qquad \text{(Cf. 84)}$$

5. Laguerre Polynomials.

$$(1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=0}^{\infty}\frac{L_n(x)}{n!}z^n \qquad \text{(Cf. 98)}$$

6. Associated Laguerre Polynomials.

$$(-1)^k(1-z)^{-1}\left(\frac{z}{1-z}\right)^k\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=k}^{\infty}\frac{L_n^k(x)}{n!}z^n \qquad \text{(Cf. 103)}$$

7. Tschebyscheff Polynomials.

$$\frac{1 - xy}{1 - 2xy + y^2} = \sum_{n=0}^{\infty}T_n(x)\,y^n$$

(Not proved in text, but Cf. eq. 2-54.)
3.13. Linear Dependence.—A set of functions $\varphi_1, \varphi_2 \cdots \varphi_n$ is said to be
linearly dependent when a set of constants, $k_1, k_2 \cdots k_n$, not all zero,
exists such that

$$\sum_{\lambda}k_\lambda\varphi_\lambda = 0$$

If this relation can be satisfied only by putting all $k_\lambda$ equal to zero, the
functions are linearly independent.

A criterion for linear dependence is easily derived. We observe that
the integral

$$I(k_1, k_2, \cdots k_n) = \int\Big|\sum_{\lambda}k_\lambda\varphi_\lambda(x)\Big|^2dx \tag{3-108}$$

taken over the range of $x$ in which the functions $\varphi_\lambda$ are considered, cannot
be smaller than zero. It will attain the minimum, zero, for specific values
of the parameters $k_\lambda$. Now it will first be shown that, if $I$ has a stationary
value at all, this value must be zero.

For this purpose, let us vary $I$, replacing every $k_\lambda$ by $(1 + \delta k)k_\lambda$. The
result is

$$I + \delta I = (1 + \delta k)^2I$$

and

$$\delta I = \left[2\delta k + (\delta k)^2\right]I$$

Where $I$ has a stationary value, $\delta I$ must vanish; but it is seen that $\delta I$
cannot vanish unless $I$ itself is zero. Therefore the stationary value of
(108) is zero, and we may say that the conditions

$$\frac{\partial I}{\partial k_\lambda} = 0, \qquad \frac{\partial I}{\partial k_\lambda^*} = 0; \qquad \lambda = 1, 2, \cdots n \tag{3-109}$$

are both sufficient and necessary for the vanishing of $I$ or, what amounts to
the same thing, for the validity of

$$\sum_{\lambda}k_\lambda\varphi_\lambda = 0$$

If, therefore, eqs. (109) have a solution other than the trivial one in
which all $k_\lambda$ are zero, the functions $\varphi_\lambda$ are linearly dependent.

But (108) may be written in a different way. If we define the
coefficients

$$a_{\lambda\mu} = \int\varphi_\lambda^*\varphi_\mu\,dx$$

we have

$$I = \sum_{\lambda\mu}a_{\lambda\mu}k_\lambda^*k_\mu$$

and eqs. (109) now read

$$\sum_{\mu}a_{\lambda\mu}k_\mu = 0, \qquad \sum_{\lambda}a_{\lambda\mu}k_\lambda^* = 0 \tag{3-110}$$

These are identical because $a_{\mu\lambda} = a_{\lambda\mu}^*$, so that one is merely the complex
conjugate form of the other. Now the condition that (110) shall have a non-vanishing
solution $k_1, k_2, \cdots k_n$ is that the determinant

$$|a_{\lambda\mu}| = 0$$

This, therefore, is the condition for linear dependence of the functions $\varphi_\lambda$.
Conversely, if $|a_{\lambda\mu}| \ne 0$, the set of functions is linearly independent. The
determinant $|a_{\lambda\mu}|$ is named after the mathematician Gram.
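The Gram criterion is easy to try on a small example. The sketch below is an illustration only: it assumes real polynomials on the range $0 \le x \le 1$, given as coefficient lists, so that every $a_{\lambda\mu}$ is an exact fraction, and it evaluates the determinant by cofactor expansion, which is adequate for small $n$. The names `inner`, `det` and `gram` are hypothetical.

```python
from fractions import Fraction

def inner(p, q):
    """a_{lambda mu}: integral over [0, 1] of p(x) q(x), for real coefficient lists."""
    return sum(
        Fraction(a * b, i + j + 1)
        for i, a in enumerate(p)
        for j, b in enumerate(q)
    )

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def gram(funcs):
    """Gram determinant |a_{lambda mu}| of a list of polynomials."""
    return det([[inner(p, q) for q in funcs] for p in funcs])

dependent = gram([[1], [0, 1], [2, 3]])        # 1, x, 2 + 3x
independent = gram([[1], [0, 1], [0, 0, 1]])   # 1, x, x^2
```

The first set is dependent because $2 + 3x$ is a combination of $1$ and $x$, so its Gram determinant vanishes; for $1, x, x^2$ the determinant is that of the 3 by 3 Hilbert matrix and is not zero.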

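A simpler test, applicable when the functions $\varphi_1, \cdots \varphi_n$ are differentiable
$n - 1$ times within their range of definition, may be conducted as
follows. If the functions are linearly dependent, then, on differentiating
the defining relation $n - 1$ times,

$$\sum_{\lambda}k_\lambda\varphi_\lambda = 0, \qquad \sum_{\lambda}k_\lambda\varphi_\lambda' = 0, \qquad \cdots \qquad \sum_{\lambda}k_\lambda\varphi_\lambda^{(n-1)} = 0$$

These $n$ homogeneous equations may be regarded as determining the
set of constants $k_\lambda$. It will be shown in section 10.9 that they possess
solutions other than $k_1 = k_2 = k_3 = \cdots = k_n = 0$ only if the determinant of the
coefficients of $k_\lambda$, called the Wronskian,

$$\begin{vmatrix}
\varphi_1 & \varphi_2 & \cdots & \varphi_n \\
\varphi_1' & \varphi_2' & \cdots & \varphi_n' \\
\cdots & \cdots & \cdots & \cdots \\
\varphi_1^{(n-1)} & \varphi_2^{(n-1)} & \cdots & \varphi_n^{(n-1)}
\end{vmatrix} = 0$$

For independence of the solutions, then, the Wronskian must not
vanish. It should be stated, however, that the vanishing of this determinant
is not a sufficient condition for linear dependence of the functions.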

3.14. Schwarz’ Inequality.—Let f and g be any two functions of x such 
that the integrals 


$$A = \int f^*f\,dx, \qquad B = \int f^*g\,dx, \qquad C = \int g^*g\,dx \tag{3-111}$$

exist. The integrations extend over any definite range of the variable $x$.
Certainly the integral

$$\int(\lambda f^* + g^*)(\lambda f + g)\,dx = A\lambda^2 + (B^* + B)\lambda + C$$

in which $\lambda$ is to be considered as a real variable, independent of $x$, is always
positive or zero (zero only when $g$ is directly proportional to $f$) and hence
has no distinct real roots in $\lambda$. But the roots of $A\lambda^2 + (B^* + B)\lambda + C$ are given
by

$$\lambda = -\frac{B^* + B}{2A} \pm \frac{1}{2A}\sqrt{(B^* + B)^2 - 4AC}$$

They are real and distinct unless

$$4AC \ge (B^* + B)^2 \tag{3-112}$$

The equality sign here holds only when $g = \text{const.} \times f$.
The quantity $B^* + B$ on the right-hand side of (112) is twice the real part of $B$. Hence, if $f$ and
$g$ are real functions, the inequality becomes

$$\int f^2\,dx\cdot\int g^2\,dx \ge \left(\int fg\,dx\right)^2 \tag{3-113}$$

which is one form of Schwarz' inequality.
For complex functions $f$ and $g$, (112) may be modified. Write $f$ and $g$
in polar form:

$$f(x) = \rho_1(x)\,e^{i\theta_1(x)}, \qquad g(x) = \rho_2(x)\,e^{i\theta_2(x)}$$

Then $B = \int\rho_1\rho_2e^{i(\theta_2-\theta_1)}dx$. Since (112) holds for every pair of functions
$f$ and $g$ (which have integrable squares), it must also be true when
$g$ is replaced by $g' = ge^{i(\theta_1-\theta_2)}$. But the substitution of $g'$ for $g$ leaves the
values of $A$ and $C$ unchanged while it converts both $B^*$ and $B$ into
$\int\rho_1\rho_2\,dx$, which is not smaller than $|B|$, the modulus of $B$. Hence

$$\int f^*f\,dx\int g^*g\,dx \ge \left|\int f^*g\,dx\right|^2 \tag{3-114}$$

This is the more general form of the Schwarz inequality. Further generalization
to functions of more than one real variable is obvious.
A relation like (114) is also valid for sums:

$$\sum_{\lambda}(f_\lambda^*f_\lambda)\,\sum_{\lambda}(g_\lambda^*g_\lambda) \ge \Big|\sum_{\lambda}f_\lambda^*g_\lambda\Big|^2 \tag{3-115}$$

For ordinary vectors $\mathbf{U}$ and $\mathbf{V}$ this is equivalent to

$$U^2V^2 \ge (\mathbf{U}\cdot\mathbf{V})^2$$
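Inequality (3-115) can be illustrated numerically before a proof is attempted. The sketch below is only an illustration with arbitrarily chosen complex components; `schwarz_gap` is a hypothetical helper that returns the left side minus the right side, which should be positive in general and zero when one set of components is a constant multiple of the other.

```python
def schwarz_gap(f, g):
    """Left side of (3-115) minus its right side for two complex sequences."""
    lhs = sum(abs(a) ** 2 for a in f) * sum(abs(b) ** 2 for b in g)
    rhs = abs(sum(complex(a).conjugate() * b for a, b in zip(f, g))) ** 2
    return lhs - rhs

f = [1 + 2j, -1j, 3.0]
g = [2 - 1j, 0.5, 1j]
general_gap = schwarz_gap(f, g)

# Proportional case: g = const * f, where the equality sign should hold.
proportional_gap = schwarz_gap(f, [(2 - 3j) * a for a in f])
```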
Problem. Prove inequality (115).

CHAPTER 4
VECTOR ANALYSIS

4.1. Definition of a Vector.—A physical quantity possessing both
magnitude and direction is called a vector; typical examples are velocity,
acceleration, force and angular momentum; other quantities such as mass,
volume, temperature and time, having magnitude only, are called scalars.
It is customary to represent vectors by letters in bold-face type and scalars
in italics, so that $\mathbf{A}$ stands for a vector whose magnitude is $A$. This custom
will here be followed. A vector may be indicated graphically by an arrow
drawn between two points, tail and head of the arrow being its origin and
terminus, respectively; the scalar part of the vector equals (or is proportional
to) the length of the arrow and the direction of arrow and vector
coincide.

It is often necessary to locate a vector relative to a coordinate system,
which may be done by giving the coordinates of origin and terminus. Let
the selected coordinate system be the usual right-handed¹ Cartesian one
with three mutually perpendicular axes $X$, $Y$ and $Z$ so oriented that if the
positive $X$-axis points towards the reader's right and the positive $Y$-axis
towards the top of the page, the positive $Z$-axis will point up from the page
towards the reader. Let the coordinates of origin and terminus of $\mathbf{A}$ be
$(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, respectively; then the three rectangular Cartesian
components of $\mathbf{A}$, relative to the axes $X$, $Y$, $Z$, are defined to be

$$A_x = x_2 - x_1; \qquad A_y = y_2 - y_1; \qquad A_z = z_2 - z_1$$

The length of the vector is the distance between the two points:

$$A = \sqrt{A_x^2 + A_y^2 + A_z^2}$$

The two points might be located relative to many other coordinate
systems, one of which could be obtained from the previous one by rotation
of the axes to $X'$, $Y'$, $Z'$ and translation of the origin from $O$ to $O'$ as shown
in Fig. 1. Suppose the coordinates of $O'$ in the first system are $(x_0, y_0, z_0)$,
then the position of the second system is determined with respect to the first

1 Left-handed systems are sometimes used. They may be obtained from right-handed
ones by changing the direction of one of the axes, or by interchanging the names
(or directions) of three of the axes. If two axes (or directions) are exchanged, the
system remains unchanged.


when the angles between $O'X'$, $O'Y'$, $O'Z'$ and $OX$, $OY$, $OZ$ are known.
The cosines of these nine angles are given in Table 1, where for the present
purpose the first row and column are to be used; for example, $a_{13}$ is the
cosine of the angle between the two straight lines $O'X'$ and $OZ$.

TABLE 1

$$\begin{array}{cc|ccc}
 & & OX & OY & OZ \\
 & & A_x & A_y & A_z \\ \hline
O'X' & A_x' & a_{11} & a_{12} & a_{13} \\
O'Y' & A_y' & a_{21} & a_{22} & a_{23} \\
O'Z' & A_z' & a_{31} & a_{32} & a_{33}
\end{array}$$

In order to locate the vector in the system $O'X'Y'Z'$, it is necessary first
to obtain relations between the nine direction cosines. From a well-known
formula of solid analytic geometry, if $\theta$ is the angle between two straight
lines, whose angles with the coordinate axes are $\alpha_1, \beta_1, \gamma_1$; $\alpha_2, \beta_2, \gamma_2$,

$$\cos\theta = \cos\alpha_1\cos\alpha_2 + \cos\beta_1\cos\beta_2 + \cos\gamma_1\cos\gamma_2 \tag{4-1}$$

Fig. 4-1

If the lines are perpendicular to each other, as is true for the axes $OY$ and
$OZ$, $\cos\theta = 0$, hence

$$a_{12}a_{13} + a_{22}a_{23} + a_{32}a_{33} = 0 \tag{4-2}$$

Five similar equations result for the other mutually perpendicular axes.
Six further relations are obtained from the fact that the sum of the squares
of the direction cosines of any line is unity; for example, for the line $OX$
relative to $O'X'Y'Z'$

$$a_{11}^2 + a_{21}^2 + a_{31}^2 = 1 \tag{4-3}$$

There are yet ten more relations, in addition to the twelve expressed in
eqs. (2) and (3). Nine of them are of the type $a_{11} = a_{22}a_{33} - a_{32}a_{23}$ and
the tenth is the determinant of the cosines, which equals unity. It is
evident that the nine cosines are not linearly independent.

Now let $(x_1, y_1, z_1)$ and $(x_1', y_1', z_1')$ be the coordinates of the same point $P$
in $OXYZ$ and $O'X'Y'Z'$ and let $\alpha_1, \beta_1, \gamma_1, \alpha'$ be the direction angles of
$O'P$ with $OX$, $OY$, $OZ$, $O'X'$. Then from (1) and Table 1,

$$\begin{aligned}
x_1' = O'P\cos\alpha' &= O'P\,(a_{11}\cos\alpha_1 + a_{12}\cos\beta_1 + a_{13}\cos\gamma_1) \\
&= a_{11}(x_1 - x_0) + a_{12}(y_1 - y_0) + a_{13}(z_1 - z_0)
\end{aligned}$$

In like manner,

$$\begin{aligned}
y_1' &= a_{21}(x_1 - x_0) + a_{22}(y_1 - y_0) + a_{23}(z_1 - z_0) \\
z_1' &= a_{31}(x_1 - x_0) + a_{32}(y_1 - y_0) + a_{33}(z_1 - z_0)
\end{aligned}$$

Similarly, if the components of $\mathbf{A}$ in $O'X'Y'Z'$ are

$$A_x' = x_2' - x_1'; \qquad A_y' = y_2' - y_1'; \qquad A_z' = z_2' - z_1'$$

then,

$$A_x' = a_{11}A_x + a_{12}A_y + a_{13}A_z \tag{4-4}$$

and two other expressions for $A_y'$ and $A_z'$ may be derived in the same way.
These three equations may be solved for the unprimed quantities in terms
of the primed ones, or the same method may be continued to give three
relations like

$$A_x = a_{11}A_x' + a_{21}A_y' + a_{31}A_z' \tag{4-5}$$

All of them are symbolized, in self-explanatory fashion, in Table 1 if the
second row and column are used. While it is usually true that the components
of a vector are different in different reference frames, certain properties
such as the length and the angle between two vectors are equal in all
frames. It is readily shown, using (2), (3) and (5), that

$$A = \sqrt{A_x^2 + A_y^2 + A_z^2} = \sqrt{A_x'^2 + A_y'^2 + A_z'^2}$$
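The use of Table 1 can be made concrete with a small numerical example. The sketch below is an illustration under a simplifying assumption: the primed axes are obtained by rotating the frame through an angle about the $Z$-axis, so that the table of cosines is known in closed form. It then applies (4-4) and (4-5) and checks the relations (4-2), (4-3) and the invariance of the length. All function names are hypothetical.

```python
import math

def direction_cosines(phi):
    """Table 1 for primed axes rotated by phi about the Z-axis: a[i][j] = cos(O'X'_i, OX_j)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def to_primed(a, vec):
    """Eq. (4-4) and its two companions: A'_i = sum over j of a_ij A_j."""
    return [sum(a[i][j] * vec[j] for j in range(3)) for i in range(3)]

def to_unprimed(a, vec_primed):
    """Eq. (4-5) and its two companions: A_j = sum over i of a_ij A'_i."""
    return [sum(a[i][j] * vec_primed[i] for i in range(3)) for j in range(3)]

def length(vec):
    return math.sqrt(sum(v * v for v in vec))

a = direction_cosines(math.radians(30.0))
A = [1.0, 2.0, 3.0]
A_primed = to_primed(a, A)
A_back = to_unprimed(a, A_primed)

orthogonality = sum(a[i][1] * a[i][2] for i in range(3))   # left side of (4-2)
normalization = sum(a[i][0] ** 2 for i in range(3))        # left side of (4-3)
```

The components change under the rotation, but `length(A)` and `length(A_primed)` agree, and `A_back` reproduces the original components.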


Considerable simplification often results in expressing physical laws
in vector notation, without reference to a selected coordinate system. The
transformation properties just described, however, show that it is always
possible to list the components of a vector in any given reference frame
when so desired. In accordance with these ideas, a vector is sometimes
defined as a set of numbers $(A_x, A_y, A_z)$ referred to a reference frame, so that
if the numbers are then referred to a second frame, they will become
$(A_x', A_y', A_z')$ with relations as given by Table 1. Provided these conditions
are met the vector is said to be a proper vector. This analytical definition
is more restrictive than the intuitive conception of a vector as a quantity
possessing magnitude and direction, but it leads naturally to the more
general idea of the tensor and it may be readily extended to the vector in
n-dimensional space as used in many branches of modern analysis. Moreover,
the analytical definition is more precise than the usual one, which
offers no explanation of the words "magnitude and direction." The three
components of a vector define these words provided they are the same in
all reference frames. Further comments on this matter will be found in
sec. 4.21.
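The relations among the direction cosines, and the invariance of length under the transformation (4-4), can be checked numerically. The sketch below is only an illustration: it assumes a particular frame O'X'Y'Z' obtained by rotating OXYZ about OZ through an arbitrary angle, and the names `rotation_about_z`, `transform` and `det3` are hypothetical helpers, not notation from the text.

```python
import math

def rotation_about_z(angle):
    """Direction cosines a[i][j] between new axis i and old axis j
    for a frame rotated about OZ (an assumed example frame)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(a, v):
    """Primed components A'_i = sum_j a_ij A_j, as in eq. (4-4)."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def det3(a):
    """Determinant of the 3 x 3 array of cosines."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

a = rotation_about_z(0.7)          # arbitrary angle in radians
A = [1.0, -2.0, 3.0]               # arbitrary vector components in OXYZ
A_primed = transform(a, A)

# eq. (4-2): the cosines of OY and OZ (columns 2 and 3) are orthogonal
col_dot = sum(a[i][1] * a[i][2] for i in range(3))
# eq. (4-3): the squared cosines of OX (column 1) sum to unity
col_norm = sum(a[i][0] ** 2 for i in range(3))
# squared length in both frames
length_sq = sum(x * x for x in A)
length_sq_primed = sum(x * x for x in A_primed)
```

For this frame `col_dot` vanishes, `col_norm` and `det3(a)` equal unity, and the two squared lengths agree to rounding error.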

4.2. Unit Vectors.—Vectors of unit length, drawn along the axes OX,
OY, and OZ, respectively, are called unit vectors (cf. Fig. 2); they are
designated by i, j, and k, respectively. Any directed line along any of the
three axes is also a vector, for if its length is $A_x$ units along the X-axis, the
scalar magnitude is thereby given and its direction is specified by the unit
vector i, the whole vector being designated by $A_x\mathbf{i}$. Similar vectors could
be drawn along the Y- or Z-axes, $A_y\mathbf{j}$ or $A_z\mathbf{k}$.

[Fig. 4-2]

4.3. Addition and Subtraction of Vectors.—Referring again to Fig. 2,
it is seen that the diagonal of the parallelogram, whose two unequal sides
are the vectors $A_x\mathbf{i}$ and $A_z\mathbf{k}$, is also a vector, its origin being taken as
coincident with the coordinate origin. From the previous discussion, it follows
that the reference frame OXYZ is superfluous, so that the symbols $A_x\mathbf{i}$
and $A_z\mathbf{k}$ may be replaced by more general symbols, A and B. The resultant
vector C representing the diagonal is called the vector sum of A and B,

$$\mathbf{A} + \mathbf{B} = \mathbf{C}$$


The addition of vectors thus obeys the familiar rule for composition of
forces in mechanics. To obtain the difference of two vectors, A − B, it is
only necessary to define the negative of a vector. This is taken to mean
a vector whose length is equal and whose direction is opposite to that of the
original vector. Thus A − B = A + (−B). Hence the rule: To form
the difference of two vectors graphically, reverse the direction of the
subtrahend and complete the parallelogram as before.

From the parallelogram law, it is seen that any vector in a plane may be 
resolved in numerous ways into two components in the same plane, and 
that a vector in space may be resolved in numerous ways into three com- 
ponents, not in the same plane. If the resolution is made along the rectan- 
gular axes, the result may be symbolized in terms of unit vectors, 


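$$\mathbf{C} = A_x\mathbf{i} + A_z\mathbf{k}$$

and

$$\mathbf{R} = A_x\mathbf{i} + A_y\mathbf{j} + A_z\mathbf{k} \tag{4-6}$$

From the geometry of Fig. 2, the lengths of C and R are

$$C = (A_x^2 + A_z^2)^{1/2}$$
$$R = (A_x^2 + A_y^2 + A_z^2)^{1/2} \tag{4-7}$$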


The laws which govern addition and subtraction of vectors are easily
seen to be associative, commutative, and distributive. Multiplication
of a vector by a scalar is understood to mean multiplication of its length
by the scalar factor, without change in its direction (a negative factor
reverses its sense). Vector algebra thus developed enables one to
demonstrate many geometrical theorems in a simple way.²

Problem a. Prove that the diagonals of a parallelogram bisect each other. 


Problem b. Prove that the line that joins one corner of a parallelogram to the 
middle point of an opposite side trisects the diagonal and is trisected by it. 


4.4. The Scalar Product of Two Vectors.—The scalar (or inner)
product³ of two vectors is defined by

$$\mathbf{A}\cdot\mathbf{B} = AB\cos\theta \tag{4-8}$$

where $\theta$ is the angle between A and B. It follows that the scalar product of
two perpendicular unit vectors must vanish since $\theta = \pi/2$, $\cos\theta = 0$.
Similarly the scalar product of a unit vector by itself must equal unity
since $\theta = 0$, $\cos\theta = 1$. In vector notation,

$$\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{i} = \mathbf{i}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{j} = 0 \tag{4-9}$$
$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = i^2 = j^2 = k^2 = 1 \tag{4-10}$$

If A = B, $\theta = 0$, so from (8) and (10),

$$\mathbf{A}\cdot\mathbf{A} = A^2 = A_x^2 + A_y^2 + A_z^2$$

an equation which defines the square of the length of A (see also Fig. 3).

[Fig. 4-3: $AB\cos\theta = \mathbf{A}\cdot\mathbf{B}$]

2 Numerous examples may be found in books on vector analysis; see for example:
Phillips, H. B., "Vector Analysis," John Wiley and Sons, New York, 1933; Gibbs-
Wilson, "Vector Analysis," Yale University Press, New Haven, Conn., 1925.

3 Also called the dot product.
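If

$$\mathbf{A}\cdot\mathbf{B} = 0 \tag{4-11}$$

for any two vectors, A and B are perpendicular to each other, unless one
vanishes; if

$$\mathbf{A}\cdot\mathbf{B} = AB$$

then A and B are parallel. In a Cartesian system,

$$\mathbf{A}\cdot\mathbf{B} = A_xB_x + A_yB_y + A_zB_z \tag{4-12}$$

The scalar product obeys the rules of ordinary multiplication

$$\mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A}$$
$$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = (\mathbf{A}\cdot\mathbf{B}) + (\mathbf{A}\cdot\mathbf{C})$$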

From (8), it is seen that any relation involving the cosine of an included
angle may be written in terms of the scalar product. For example, the
mechanical work W done by a force F which makes an angle $\theta$ with the
displacement D is $W = FD\cos\theta$ or in vector notation, $W = \mathbf{F}\cdot\mathbf{D}$.
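As a small numerical illustration of (4-8) and (4-12), the work may be computed both from the components and from the lengths and included angle. The force and displacement below are made-up values, and `dot`, `length` and `angle_between` are hypothetical helper names.

```python
import math

def dot(a, b):
    """Scalar product in Cartesian components, eq. (4-12)."""
    return sum(x * y for x, y in zip(a, b))

def length(a):
    """Length of a vector, from A.A = A^2."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Included angle obtained by inverting eq. (4-8)."""
    return math.acos(dot(a, b) / (length(a) * length(b)))

F = (3.0, 4.0, 0.0)   # assumed force components
D = (2.0, 0.0, 0.0)   # assumed displacement components

theta = angle_between(F, D)
work_components = dot(F, D)                               # W = F . D
work_geometric = length(F) * length(D) * math.cos(theta)  # W = F D cos(theta)
```

Both routes give the same work, which for these values is 6 units.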


Problem. If A and B are the sides of a parallelogram and C, D are the diagonals,
show that $C^2 + D^2 = 2(A^2 + B^2)$; $C^2 - D^2 = 4AB\cos\widehat{AB}$, where $\widehat{AB}$ is the
angle between A and B.

4.5. The Vector Product of Two Vectors.—Let two arbitrary vectors,
A and B, be drawn from a common origin O with an included angle $\theta$,
$0 < \theta < \pi$, and let C be a vector perpendicular to $\Sigma$, the plane of A and B
(cf. Fig. 4). Then from (11) and (12)

$$\mathbf{C}\cdot\mathbf{A} = C_xA_x + C_yA_y + C_zA_z = 0$$
$$\mathbf{C}\cdot\mathbf{B} = C_xB_x + C_yB_y + C_zB_z = 0$$

Solving, we find

$$C_x = m(A_yB_z - A_zB_y)$$
$$C_y = m(A_zB_x - A_xB_z) \tag{4-13}$$
$$C_z = m(A_xB_y - A_yB_x)$$

where m is an arbitrary constant, which is conveniently taken as +1.
Then from (6) and (13),

$$C^2 = C_x^2 + C_y^2 + C_z^2 = (A_x^2 + A_y^2 + A_z^2)(B_x^2 + B_y^2 + B_z^2) - (A_xB_x + A_yB_y + A_zB_z)^2$$

The first member on the right-hand side of this equals $A^2B^2$ by (6), the
second member equals $(\mathbf{A}\cdot\mathbf{B})^2 = (AB\cos\theta)^2$ by (12) and (8), hence

$$C^2 = (A^2B^2 - A^2B^2\cos^2\theta) = (AB\sin\theta)^2$$

[Fig. 4-4: $\mathbf{C} = \mathbf{A}\times\mathbf{B}$]


The vector C may thus be described as the product of two other vectors,
A and B; it is called the vector (or skew) product⁴ and is written

$$\mathbf{C} = \mathbf{A}\times\mathbf{B}$$

Its length is $C = AB\sin\theta$; its direction is perpendicular to the plane
determined by A and B. Using (13) and the unit vectors, we may also write

$$\mathbf{C} = \mathbf{A}\times\mathbf{B} = (A_yB_z - A_zB_y)\mathbf{i} + (A_zB_x - A_xB_z)\mathbf{j} + (A_xB_y - A_yB_x)\mathbf{k}$$


4 Also called the cross product or outer product. 




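This may be put in the form of a determinant:

$$\mathbf{A}\times\mathbf{B} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix} \tag{4-14}$$

As a consequence of (14), vector products of the unit vectors become

$$\mathbf{i}\times\mathbf{j} = -\mathbf{j}\times\mathbf{i} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \mathbf{k}$$
$$\mathbf{j}\times\mathbf{k} = -\mathbf{k}\times\mathbf{j} = \mathbf{i}; \qquad \mathbf{k}\times\mathbf{i} = -\mathbf{i}\times\mathbf{k} = \mathbf{j}$$

and

$$\mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = 0 \tag{4-15}$$

Eq. (14) shows that vector multiplication is not commutative,

$$\mathbf{A}\times\mathbf{B} = -\mathbf{B}\times\mathbf{A}$$

The distributive law of ordinary multiplication, however, is retained:

$$\mathbf{A}\times(\mathbf{B} + \mathbf{C}) = \mathbf{A}\times\mathbf{B} + \mathbf{A}\times\mathbf{C}$$
$$(\mathbf{A} + \mathbf{B})\times(\mathbf{C} + \mathbf{D}) = \mathbf{A}\times\mathbf{C} + \mathbf{A}\times\mathbf{D} + \mathbf{B}\times\mathbf{C} + \mathbf{B}\times\mathbf{D}$$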
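The properties of the vector product stated above lend themselves to a quick numerical check. The following sketch uses arbitrary component values and a hypothetical helper `cross` built directly from (4-13) with m = +1; it is an illustration, not part of the derivation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Components of A x B from eq. (4-13) with m = +1."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

A = (1.0, 2.0, 3.0)    # arbitrary example vectors
B = (-2.0, 0.5, 4.0)
C = cross(A, B)

# AB sin(theta), with theta obtained from the scalar product (4-8)
AB = math.sqrt(dot(A, A) * dot(B, B))
cos_theta = dot(A, B) / AB
ab_sin_theta = AB * math.sqrt(1.0 - cos_theta ** 2)
length_C = math.sqrt(dot(C, C))
```

Here `C` is perpendicular to both factors, `length_C` agrees with `ab_sin_theta`, and `cross(B, A)` is the negative of `C`.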


Problem. Prove by vector methods the trigonometric relations

$$\cos(x \pm y) = \cos x\cos y \mp \sin x\sin y$$
$$\sin(x \pm y) = \sin x\cos y \pm \cos x\sin y$$

Hint: Take three vectors:

$$\mathbf{A} = \cos x\,\mathbf{i} + \sin x\,\mathbf{j}$$
$$\mathbf{B} = \cos y\,\mathbf{i} + \sin y\,\mathbf{j}$$
$$\mathbf{C} = \cos y\,\mathbf{i} - \sin y\,\mathbf{j}$$

Form the scalar and vector products.


The close connection between the vector $\mathbf{C} = \mathbf{A}\times\mathbf{B}$ and the
parallelogram whose sides are A and B suggests that it may be useful generally to
represent areas by vectors. The convention usually adopted in this
connection, with reference to plane areas, is the following. The area is
represented by a vector perpendicular to the area, and of length equal to its
size. This leaves the direction of the vector undetermined. The latter
is fixed relative to the sense in which the contour of the area is described: 
it is taken to be that direction in which a right-handed screw would advance 
when turned in the sense in which the contour is to be described. When the 
sense of the contour is not specified, the direction of the area vector remains 
undetermined. For closed surfaces, it is customary to draw the vector 
along the outward normal. 

We now consider two important examples of vector products.

a. Moment of a Force. In mechanics, the moment of a force about a
point O is defined as the product of the force by its perpendicular distance
from O. From the geometry of Fig. 5a this product equals twice the area
of the triangle OPQ. It may be represented as

$$\mathbf{M} = \mathbf{D}\times\mathbf{F}$$

where M, D, and F are vectors representing the moment, perpendicular
distance and force, respectively. The sign of M, fixed by the previous
definition of the area as a vector quantity, is positive on that side of the
plane passed through O and the line F on which the force tends to produce
a rotation about O in the positive direction. If D be drawn from O to any
point in the line of action of F (cf. Fig. 5b), the perpendicular distance is
$D\sin\theta$ and the moment is still given by the vector product, $\mathbf{D}\times\mathbf{F}$. If the
force has components $F_x, F_y, F_z$ and D has components $D_x, D_y, D_z$, the
components of M are

$$M_x = (D_yF_z - D_zF_y)$$
$$M_y = (D_zF_x - D_xF_z)$$
$$M_z = (D_xF_y - D_yF_x)$$

[Fig. 4-5: panels (a) and (b)]

b. Angular and Linear Velocity. Suppose a rigid body is rotating about
a fixed axis, with a constant angular velocity of $\omega$ radians per second. The
rotation of the body is then described by the vector $\boldsymbol{\omega}$ with length equal to
the scalar $\omega$ and direction parallel to the axis of rotation. Its sign, by the
convention, is positive in the same direction in which a right-handed
screw would progress under the given rotation. Any point P, not on the
axis (cf. Fig. 6), will then describe a circle concentric with, and in a plane
perpendicular to, the axis, this point being determined by any vector R
drawn from a point O on the axis of rotation. The linear velocity of P is at
right angles to both $\boldsymbol{\omega}$ and R, its magnitude being $L = \omega R\sin\theta$, or in vector
symbols

$$\mathbf{L} = \boldsymbol{\omega}\times\mathbf{R} \tag{4-16}$$
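Both examples reduce to evaluating a vector product from components. The sketch below uses invented numbers: a force along OY applied at two different points of its line of action, and a body turning about OZ. The helper `cross` is a hypothetical name.

```python
def cross(a, b):
    """Vector product from components, eq. (4-13) with m = +1."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# a. Moment of a force: F acts along a line parallel to OY through x = 1.
F = (0.0, 10.0, 0.0)
D_perpendicular = (1.0, 0.0, 0.0)   # perpendicular distance from O
D_oblique = (1.0, 4.0, 0.0)         # another point on the same line of action
M_perpendicular = cross(D_perpendicular, F)
M_oblique = cross(D_oblique, F)

# b. Linear velocity: rotation at 2 rad/s about OZ, eq. (4-16).
omega = (0.0, 0.0, 2.0)
R = (3.0, 0.0, 5.0)                 # drawn from a point O on the axis
L = cross(omega, R)
```

The moment is the same for either choice of D, and the speed of P equals the angular velocity times its distance from the axis (here 2 × 3 = 6).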


[Fig. 4-6]

4.6. Products Involving Three Vectors.—From three arbitrary vectors,
A, B and C, the following products may tentatively be formed:

(a) $\mathbf{A}(\mathbf{B}\cdot\mathbf{C})$          (d) $\mathbf{A}(\mathbf{B}\times\mathbf{C})$
(b) $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$     (e) $\mathbf{A}\cdot(\mathbf{B}\cdot\mathbf{C})$
(c) $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$    (f) $\mathbf{A}\times(\mathbf{B}\cdot\mathbf{C})$

Of these expressions, (e) and (f) are meaningless since vector products
have only been defined when vectors stand on both sides of the dot or cross.
Furthermore, no meaning has been attached to two vectors standing
together in the absence of one of these signs, hence (d) is of no interest
here.


a. Since $(\mathbf{B}\cdot\mathbf{C}) = BC\cos\theta$ is a scalar, the triple product $\mathbf{A}(\mathbf{B}\cdot\mathbf{C})$ is a
new vector whose direction is the same as that of A; its magnitude equals A
multiplied by $BC\cos\theta$.

b. The product $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$, called the scalar triple product, is a scalar,
for $(\mathbf{B}\times\mathbf{C}) = \mathbf{D}$, a new vector. We have

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = (\mathbf{B}\times\mathbf{C})\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{D} = \mathbf{D}\cdot\mathbf{A} = \text{a scalar}$$

Moreover, the new vector D is perpendicular to both B and C, or from (11)

$$\mathbf{B}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{C}\cdot(\mathbf{B}\times\mathbf{C}) = 0$$

If the three vectors A, B and C are the edges of a parallelepiped, as shown
in Fig. 7, then $(\mathbf{B}\times\mathbf{C})$ is a vector whose length equals the area of the
parallelogram forming the base of the parallelepiped; its direction is
perpendicular to the plane of B and C. The scalar triple product is thus the
area of the base multiplied by the projection of the slant height A on the
vector $(\mathbf{B}\times\mathbf{C})$, or a scalar whose magnitude equals the volume of the
parallelepiped, v. By taking various faces in turn, we find from Fig. 7,

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\cdot(\mathbf{C}\times\mathbf{A}) = \mathbf{C}\cdot(\mathbf{A}\times\mathbf{B}) = v$$


Since a change of order in the vector product changes the sign there are
other possible relations which may be abbreviated by writing

$$v = [\mathbf{ABC}] = \mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = (\mathbf{B}\times\mathbf{C})\cdot\mathbf{A}, \text{ etc.} \tag{4-17}$$

and

$$[\mathbf{ABC}] = [\mathbf{BCA}] = [\mathbf{CAB}] = -[\mathbf{ACB}] = -[\mathbf{BAC}] = -[\mathbf{CBA}] \tag{4-18}$$

Each term in square brackets stands for the two possible ways of writing
the triple product as shown in (17). It also follows from (18) that the
cross and dot may be exchanged at will, provided the cyclical order of the
three vectors is retained. The parenthesis in a product like $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$ is
superfluous but it is often written for clarity.

Because of (15), the scalar triple products of unit vectors all disappear
except

$$[\mathbf{ijk}] = -[\mathbf{ikj}] = 1$$

which follows from (9). If the three vectors A, B, C are written in terms of
unit vectors and the indicated multiplications performed, the use of (9),
(10) and (15) gives

$$[\mathbf{ABC}] = A_xB_yC_z + B_xC_yA_z + C_xA_yB_z - A_xC_yB_z - B_xA_yC_z - C_xB_yA_z$$
$$\phantom{[\mathbf{ABC}]} = \begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix} \tag{4-19}$$


c. The product, $\mathbf{V} = \mathbf{A}\times(\mathbf{B}\times\mathbf{C})$, called the vector triple product, is
a vector since it is the vector product of two vectors, A and $(\mathbf{B}\times\mathbf{C})$. It
is therefore perpendicular to both of its factors:

$$\mathbf{V}\cdot\mathbf{A} = 0; \qquad \mathbf{V}\cdot(\mathbf{B}\times\mathbf{C}) = 0$$

but V must lie in the plane of B and C, since it is perpendicular to the vector
product of B and C, which itself is perpendicular to both B and C.
The most important property of this triple product is that it permits
decomposition into two terms, each containing a scalar product:

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B}) \tag{4-20}$$

a relation which may be proved geometrically or analytically by expanding
in Cartesian coordinates. Since the vector product changes its sign when
the order of multiplication is changed, the sign of the triple vector product
will change when the order of the factors in the parenthesis is changed or
when the position of the parenthesis is changed:

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = -\mathbf{A}\times(\mathbf{C}\times\mathbf{B}) = (\mathbf{C}\times\mathbf{B})\times\mathbf{A} = -(\mathbf{B}\times\mathbf{C})\times\mathbf{A}$$

Products of more than three vectors may always be reduced to one of the
three preceding types of triple products by successive application of the
above rules.
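The cyclic property (4-18) and the expansion (4-20) may be verified for particular vectors. The components below are arbitrary, and `dot`, `cross` and `scalar_triple` are hypothetical helper names used only for this sketch.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scalar_triple(a, b, c):
    """[ABC] = A . (B x C), eq. (4-17)."""
    return dot(a, cross(b, c))

A = (1.0, 2.0, 3.0)
B = (0.0, 1.0, 4.0)
C = (5.0, 6.0, 0.0)

volume = scalar_triple(A, B, C)

# Left and right members of eq. (4-20)
lhs = cross(A, cross(B, C))
rhs = tuple(b * dot(A, C) - c * dot(A, B) for b, c in zip(B, C))
```

For these values `volume` is 1, cyclic permutation of the factors leaves it unchanged, interchange of two factors reverses its sign, and `lhs` equals `rhs`.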


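Problem. Verify the relations:

$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C})$$
$$(\mathbf{A}\times\mathbf{B})\times(\mathbf{C}\times\mathbf{D}) = \mathbf{B}[\mathbf{ACD}] - \mathbf{A}[\mathbf{BCD}] = \mathbf{C}[\mathbf{ABD}] - \mathbf{D}[\mathbf{ABC}]$$
$$[\mathbf{A}\times\mathbf{B}\;\; \mathbf{B}\times\mathbf{C}\;\; \mathbf{C}\times\mathbf{A}] = [\mathbf{ABC}]^2$$

[Fig. 4-8]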


4.7. Differentiation of Vectors.—If a vector R is a function of a single
scalar t, which for convenience may be assumed to be the time, there are
three possible ways in which R may vary. Let $\mathbf{R}_1$ and $\mathbf{R}_2$ refer to times $t_1$
and $t_2$, then $\mathbf{R}_2$ may differ from $\mathbf{R}_1$: (a) in magnitude only; (b) in
direction only; (c) in both magnitude and direction as shown in Fig. 8. Since
no complication arises from treating the general case, let us assume that a
curve is traced by the terminus of a continuously varying vector R, the
origin of the latter being kept fixed at the origin of a coordinate system.
The vector $\Delta\mathbf{R} = \mathbf{R}_2 - \mathbf{R}_1$, having the direction of the secant AB of
Fig. 8, approaches the tangent of the curve C at the point $\mathbf{R}_1$ as $\Delta t = t_2 - t_1$
approaches zero. The quotient $\Delta\mathbf{R}/\Delta t$ is the average rate of change of R
in the time interval between $t_1$ and $t_2$. Following the usual methods of
differential calculus, the derivative of R is defined as

$$\lim_{\Delta t \to 0}\frac{\Delta\mathbf{R}}{\Delta t} = \frac{d\mathbf{R}}{dt}$$

In terms of unit vectors, and with the use of primes for differentiation

$$\mathbf{R} = \mathbf{i}R_x + \mathbf{j}R_y + \mathbf{k}R_z$$
$$\mathbf{R}' = \mathbf{i}R_x' + \mathbf{j}R_y' + \mathbf{k}R_z'$$
$$\mathbf{R}'' = \mathbf{i}R_x'' + \mathbf{j}R_y'' + \mathbf{k}R_z''$$


For a composite function of two or more vectors, each of which depends on 
the single scalar t, the usual rules of differentiation hold except that, of 
course, the order of the vectors must not be changed in cases involving the 
vector product. 

In the special case of Fig. 8a, where R is constant in direction but
variable in magnitude, $\Delta\mathbf{R}$ is parallel to R. Similarly, in case (b), $d\mathbf{R}/dt$ is
perpendicular to R, for since the length of R is fixed, $\mathbf{R}\cdot\mathbf{R} = R^2$ is constant,
$d(\mathbf{R}\cdot\mathbf{R})/dt = 0$ and hence $\mathbf{R}\cdot d\mathbf{R}/dt = 0$, the latter being the requirement
that R and $d\mathbf{R}/dt$ be perpendicular.
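Case (b) can be illustrated numerically by replacing the limit with a small finite interval. In the sketch below the vector R(t) is an invented example of fixed length 2 turning in the XY-plane, and `derivative` is a hypothetical helper that forms the symmetric quotient [R(t + h) − R(t − h)]/2h as an approximation to dR/dt.

```python
import math

def R(t):
    """Assumed example: a vector of fixed length 2 rotating in the XY-plane."""
    return (2.0 * math.cos(t), 2.0 * math.sin(t), 0.0)

def derivative(f, t, h=1e-6):
    """Symmetric difference quotient approximating dR/dt component by component."""
    ahead = f(t + h)
    behind = f(t - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(ahead, behind))

t0 = 0.8
dR = derivative(R, t0)
r_dot_dr = sum(r * d for r, d in zip(R(t0), dR))   # should be close to zero
```

The scalar product `r_dot_dr` vanishes to within the error of the approximation, as required when only the direction of R changes.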

4.8. Scalar and Vector Fields.—A scalar field is defined as a region of
space, with each point of which there is associated a scalar point function
(cf. sec. 1.7). A simple example is the temperature of points in the
atmosphere at a given moment. On the other hand, if there is a vector
associated with each point in a region of space, the points and vectors constitute
a vector field, an example being the wind velocity of points in the atmosphere
at any instant.

Suppose $\phi(x,y,z)$ is a scalar point function referred to a given coordinate
system. It will usually change its form if referred to another system, say
$\phi'(x',y',z')$, but its value at any point must be unchanged, or $\phi = \phi'$. For
example, the temperature at any point in the atmosphere cannot depend on
the coordinate system used to describe the point. Differentiating $\phi = \phi'$
partially, we obtain

$$\frac{\partial\phi'}{\partial x'} = \frac{\partial x}{\partial x'}\frac{\partial\phi}{\partial x} + \frac{\partial y}{\partial x'}\frac{\partial\phi}{\partial y} + \frac{\partial z}{\partial x'}\frac{\partial\phi}{\partial z} = a_{11}\frac{\partial\phi}{\partial x} + a_{12}\frac{\partial\phi}{\partial y} + a_{13}\frac{\partial\phi}{\partial z}$$
$$\frac{\partial\phi'}{\partial y'} = a_{21}\frac{\partial\phi}{\partial x} + a_{22}\frac{\partial\phi}{\partial y} + a_{23}\frac{\partial\phi}{\partial z}$$
$$\frac{\partial\phi'}{\partial z'} = a_{31}\frac{\partial\phi}{\partial x} + a_{32}\frac{\partial\phi}{\partial y} + a_{33}\frac{\partial\phi}{\partial z}$$

with three similar equations for $\partial\phi/\partial x$, $\partial\phi/\partial y$, $\partial\phi/\partial z$. Comparing these
derivatives with (4) and (5), it follows that $(\partial\phi/\partial x, \partial\phi/\partial y, \partial\phi/\partial z)$ are the
three components of a vector since they transform from one reference frame
to another in the manner prescribed for vector components.

Using the abbreviation

$$\nabla = \mathbf{i}\,\partial/\partial x + \mathbf{j}\,\partial/\partial y + \mathbf{k}\,\partial/\partial z \tag{4-21}$$

let us study the quantities $\nabla \ast \psi$, where $\psi$ is either a scalar or a vector and
$(\ast)$ is either to be omitted or replaced by a dot or a cross in order to give
products which have meaning. The operator, $\nabla$, called "del," is not a
vector in the geometrical sense since it has no scalar magnitude, but it does
transform properly, so that it may be treated formally as a vector. The
possible products are $\nabla\phi$, where $\phi$ is a scalar point function; $\nabla\cdot\mathbf{V}$ and
$\nabla\times\mathbf{V}$, where V is a vector field.

4.9. The Gradient.—The first of these products, called the gradient of
the scalar $\phi$

$$\nabla\phi = \operatorname{grad}\phi = \mathbf{i}\,\partial\phi/\partial x + \mathbf{j}\,\partial\phi/\partial y + \mathbf{k}\,\partial\phi/\partial z \tag{4-22}$$

is a vector, since it is the product of a scalar $\phi$ and a vector $\nabla$. To
perceive its physical significance, let us consider the family of surfaces,
$\phi(x,y,z)$ = constant, or the equivalent of this relation

$$d\phi = (\partial\phi/\partial x)\,dx + (\partial\phi/\partial y)\,dy + (\partial\phi/\partial z)\,dz = 0$$

At any point P with coordinates (x,y,z), on one of these surfaces $d\mathbf{R} =
\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$ is a vector, tangent to the surface at P, provided dx, dy, dz
satisfy the preceding equation. Since $\nabla\phi\cdot d\mathbf{R} = d\phi = 0$, $d\mathbf{R}$ and $\nabla\phi$ are
perpendicular to each other, or $\nabla\phi$ is perpendicular to that surface of the family
which passes through P. By the convention of signs previously established,
the direction of $\nabla\phi$ is that in which $\phi$ is increasing. For any other
direction determined by the unit vector s with direction cosines (l,m,n)
through P, the component of $\nabla\phi$ in the direction s is

$$l\,\partial\phi/\partial x + m\,\partial\phi/\partial y + n\,\partial\phi/\partial z$$

which may be written $\mathbf{s}\cdot\nabla\phi$. This is the directional derivative of $\phi$
in the direction s. In going from P on one of the surfaces (cf. Fig. 9)
to any point Q on the surface $\phi + d\phi$, the increase in $\phi$ is the same wherever
the point Q is chosen, but the distance PQ will be smallest and hence $\mathbf{s}\cdot\nabla\phi$
greatest when s is in the direction of the normal N. Therefore, since $\nabla\phi$
is normal to the surface $\phi$ = const., at the point P its direction and
magnitude give the maximum space rate of increase of the scalar $\phi$.

[Fig. 4-9: the surfaces $\phi$ = const. and $\phi + d\phi$]
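The statement that the directional derivative is greatest along the normal can be tried out on a specific field. The sketch below assumes the scalar function φ = x² + 2y² + 3z² (an arbitrary choice) and approximates the partial derivatives by symmetric difference quotients; `grad` and `directional` are hypothetical helper names.

```python
import math

def phi(x, y, z):
    """Assumed example of a scalar point function."""
    return x * x + 2.0 * y * y + 3.0 * z * z

def grad(f, p, h=1e-5):
    """Approximate components of grad f at point p, eq. (4-22)."""
    g = []
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        g.append((f(*plus) - f(*minus)) / (2.0 * h))
    return tuple(g)

P = (1.0, 1.0, 1.0)
g = grad(phi, P)                                   # close to (2, 4, 6)
g_len = math.sqrt(sum(c * c for c in g))

def directional(s):
    """Directional derivative s . grad(phi) for a unit vector s."""
    return sum(si * gi for si, gi in zip(s, g))

normal = tuple(c / g_len for c in g)                          # along grad(phi)
tangent = (2.0 / math.sqrt(5.0), -1.0 / math.sqrt(5.0), 0.0)  # lies in the surface at P
```

Along `normal` the directional derivative equals the length of the gradient, along `tangent` it vanishes, and any other unit vector gives an intermediate value.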


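4.10. The Divergence.—The scalar product of the vector operator $\nabla$
and a vector V gives a scalar which is called the divergence of V.

$$\nabla\cdot\mathbf{V} = \operatorname{div}\mathbf{V} = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\cdot(\mathbf{i}V_x + \mathbf{j}V_y + \mathbf{k}V_z)$$
$$\phantom{\nabla\cdot\mathbf{V}} = \partial V_x/\partial x + \partial V_y/\partial y + \partial V_z/\partial z \tag{4-23}$$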


If V is a vector field, the derivative $\partial V_x/\partial x$ transforms, when a change of
coordinate system is made, like the product $A_xB_x$ of the x-components of two
vectors A and B, hence the divergence of V is a scalar point function. Suppose
that V represents at each point in space the direction and magnitude of flow
(density times velocity) of some fluid such as water or a gas, or that it
represents thermal or electrical flux. Consider, for example (Fig. 10), a small
parallelepiped of volume $dx\,dy\,dz = d\tau$, through which a fluid is passing.
The loss of fluid mass through face ABCD per unit time is

$$\mathbf{i}\cdot\left[\mathbf{V}(x,y,z) + \frac{\partial\mathbf{V}}{\partial x}\frac{dx}{2}\right]dy\,dz$$

while the gain through EFGH is

$$\mathbf{i}\cdot\left[\mathbf{V}(x,y,z) - \frac{\partial\mathbf{V}}{\partial x}\frac{dx}{2}\right]dy\,dz$$

Therefore the net loss through these two faces is

$$\mathbf{i}\cdot\frac{\partial\mathbf{V}}{\partial x}\,dx\,dy\,dz$$

The losses through the other two pairs of faces are

$$\mathbf{j}\cdot\frac{\partial\mathbf{V}}{\partial y}\,dx\,dy\,dz \quad\text{and}\quad \mathbf{k}\cdot\frac{\partial\mathbf{V}}{\partial z}\,dx\,dy\,dz$$

The total loss from the parallelepiped is therefore

$$\left(\mathbf{i}\cdot\frac{\partial\mathbf{V}}{\partial x} + \mathbf{j}\cdot\frac{\partial\mathbf{V}}{\partial y} + \mathbf{k}\cdot\frac{\partial\mathbf{V}}{\partial z}\right)dx\,dy\,dz = \nabla\cdot\mathbf{V}\,d\tau$$

[Fig. 4-10]

If v is the velocity of the fluid of density $\rho$, $\mathbf{V} = \rho\mathbf{v}$ is called the flux density
and represents the total flow of fluid per unit cross section in unit time.
Then if no fluid is created or destroyed within the parallelepiped, this loss
of mass must equal $-(\partial\rho/\partial t)\,d\tau$, so that

$$\nabla\cdot\mathbf{V} = -\frac{\partial\rho}{\partial t}$$

a relation usually called the equation of continuity. If the liquid is
incompressible, $\partial\rho/\partial t = 0$, hence

$$\nabla\cdot\mathbf{V} = 0$$

A similar relation holds for D, the electric displacement,

$$\nabla\cdot\mathbf{D} = 0$$
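The divergence of a given field can be estimated at a point by replacing the partial derivatives in (4-23) with symmetric difference quotients. The sketch below is illustrative only: the field V = (x², xy, −3xz) is an invented example whose divergence 2x + x − 3x vanishes everywhere, and `divergence` is a hypothetical helper name.

```python
def V(x, y, z):
    """Assumed example of a flux density whose divergence is zero."""
    return (x * x, x * y, -3.0 * x * z)

def radial(x, y, z):
    """The radius vector R = ix + jy + kz, for comparison."""
    return (x, y, z)

def divergence(field, p, h=1e-5):
    """Approximate dVx/dx + dVy/dy + dVz/dz at point p, eq. (4-23)."""
    total = 0.0
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        total += (field(*plus)[i] - field(*minus)[i]) / (2.0 * h)
    return total

div_V = divergence(V, (1.5, -2.0, 0.7))
div_R = divergence(radial, (0.3, 0.2, 0.1))
```

The first result is zero to within rounding, as for the flow of an incompressible fluid, while the divergence of the radius vector is 3.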
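4.11. The Curl.—The vector product of $\nabla$ and V is called the curl or
rotation of V

$$\operatorname{curl}\mathbf{V} = \nabla\times\mathbf{V} = \mathbf{i}\left[\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z}\right] + \mathbf{j}\left[\frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x}\right] + \mathbf{k}\left[\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right]$$
$$\phantom{\operatorname{curl}\mathbf{V}} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ V_x & V_y & V_z \end{vmatrix} \tag{4-24}$$

This function may be used to describe the motion of a rigid body rotating
about an axis with uniform angular velocity, $\boldsymbol{\omega}$. The linear velocity of any
point P in the body with radius vector R is (cf. 16) $\mathbf{L} = \boldsymbol{\omega}\times\mathbf{R}$ and

$$\operatorname{curl}\mathbf{L} = \nabla\times(\boldsymbol{\omega}\times\mathbf{R}) \tag{4-25}$$

Expanding (25), by (20)

$$\operatorname{curl}\mathbf{L} = \boldsymbol{\omega}(\nabla\cdot\mathbf{R}) - (\nabla\cdot\boldsymbol{\omega})\mathbf{R}$$

Since $\mathbf{R} = \mathbf{i}R_x + \mathbf{j}R_y + \mathbf{k}R_z = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$, $\nabla\cdot\mathbf{R} = 3$. The angular
velocity is a constant vector, so that $\nabla$ in the last member operates only on R;
we may therefore write that member in the form $(\boldsymbol{\omega}\cdot\nabla)\mathbf{R}$, which is to be
interpreted as the product of a scalar operator $(\boldsymbol{\omega}\cdot\nabla)$ and a vector R. Expanding,

$$(\boldsymbol{\omega}\cdot\nabla)\mathbf{R} = \left(\omega_x\frac{\partial}{\partial x} + \omega_y\frac{\partial}{\partial y} + \omega_z\frac{\partial}{\partial z}\right)\mathbf{R} = \mathbf{i}\omega_x + \mathbf{j}\omega_y + \mathbf{k}\omega_z = \boldsymbol{\omega}$$

Hence,

$$\operatorname{curl}\mathbf{L} = 3\boldsymbol{\omega} - \boldsymbol{\omega} = 2\boldsymbol{\omega}$$

or the curl of the linear velocity of any point of a rigid body equals twice
the angular velocity; only the magnitude changes, not the direction.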
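The result curl L = 2ω may be confirmed numerically for a particular rotation. In the sketch below the angular velocity and the point of evaluation are arbitrary choices, and the derivatives in (4-24) are replaced by symmetric difference quotients; `cross`, `L` and `curl` are hypothetical helper names.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

omega = (0.3, -1.2, 2.0)     # assumed constant angular velocity

def L(x, y, z):
    """Linear velocity L = omega x R of the point with radius vector R, eq. (4-16)."""
    return cross(omega, (x, y, z))

def curl(field, p, h=1e-5):
    """Approximate components of curl V at point p, eq. (4-24)."""
    def d(i, j):
        # partial derivative of component i with respect to coordinate j
        plus = list(p)
        minus = list(p)
        plus[j] += h
        minus[j] -= h
        return (field(*plus)[i] - field(*minus)[i]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

curl_L = curl(L, (0.4, -0.7, 1.1))
```

Each component of `curl_L` is twice the corresponding component of `omega`, independently of the point chosen.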

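4.12. Composite Functions Involving $\nabla$.—The following relations
involving $\nabla$ may be verified by expanding the vectors in terms of their
components along three unit vectors, i, j and k.

$$\nabla\ast(\mathbf{A} + \mathbf{B}) = \nabla\ast\mathbf{A} + \nabla\ast\mathbf{B}$$
$$\nabla\ast(\phi\mathbf{A}) = \nabla\phi\ast\mathbf{A} + \phi\,\nabla\ast\mathbf{A} \tag{4-26}$$
$$\nabla(\mathbf{U}\cdot\mathbf{V}) = (\mathbf{V}\cdot\nabla)\mathbf{U} + (\mathbf{U}\cdot\nabla)\mathbf{V} + \mathbf{V}\times(\nabla\times\mathbf{U}) + \mathbf{U}\times(\nabla\times\mathbf{V})$$
$$\nabla\cdot(\mathbf{U}\times\mathbf{V}) = \mathbf{V}\cdot\nabla\times\mathbf{U} - \mathbf{U}\cdot\nabla\times\mathbf{V}$$
$$\nabla\times(\mathbf{U}\times\mathbf{V}) = (\mathbf{V}\cdot\nabla)\mathbf{U} - \mathbf{V}(\nabla\cdot\mathbf{U}) - (\mathbf{U}\cdot\nabla)\mathbf{V} + \mathbf{U}(\nabla\cdot\mathbf{V})$$

In these equations A and B are either scalars or vectors depending on the
choice of $(\ast)$, $\phi$ is a scalar and U, V are vectors. If $\mathbf{R} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$,

$$\nabla\cdot\mathbf{R} = 3$$
$$\nabla\times\mathbf{R} = 0$$
$$\mathbf{U}\cdot\nabla\mathbf{R} = \mathbf{U}$$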


Problem. Prove eqs. (26). 


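4.13. Successive Applications of $\nabla$.—There are six possible
combinations in which $\nabla$ occurs twice. The following relations may be proved as
above by expansion in terms of i, j and k.

a. $$\nabla\cdot\nabla\phi = \nabla^2\phi = \nabla\cdot\operatorname{grad}\phi = \operatorname{div}\operatorname{grad}\phi = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2} \tag{4-27}$$

The operator $\nabla^2$ is generally called the Laplacian.

b. Since $\nabla^2$ is a scalar, it may also be applied to a vector, the result
being a new vector

$$(\nabla\cdot\nabla)\mathbf{V} = \nabla^2\mathbf{V} = \frac{\partial^2\mathbf{V}}{\partial x^2} + \frac{\partial^2\mathbf{V}}{\partial y^2} + \frac{\partial^2\mathbf{V}}{\partial z^2} \tag{4-28}$$

c. $$\nabla(\nabla\cdot\mathbf{V}) = \operatorname{grad}\operatorname{div}\mathbf{V} = \mathbf{i}\left[\frac{\partial^2V_x}{\partial x^2} + \frac{\partial^2V_y}{\partial x\,\partial y} + \frac{\partial^2V_z}{\partial x\,\partial z}\right] + \mathbf{j}\left[\frac{\partial^2V_x}{\partial x\,\partial y} + \frac{\partial^2V_y}{\partial y^2} + \frac{\partial^2V_z}{\partial y\,\partial z}\right] + \mathbf{k}\left[\frac{\partial^2V_x}{\partial x\,\partial z} + \frac{\partial^2V_y}{\partial y\,\partial z} + \frac{\partial^2V_z}{\partial z^2}\right] \tag{4-29}$$

d. $$\nabla\times\nabla\phi = \operatorname{curl}\operatorname{grad}\phi = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ \partial\phi/\partial x & \partial\phi/\partial y & \partial\phi/\partial z \end{vmatrix} = 0 \tag{4-30}$$

This is an identity. If for some vector V, $\nabla\times\mathbf{V} = 0$, then $\mathbf{V} = \nabla\phi$,
where $\phi$ is some scalar function. Under these conditions, V is said to be
irrotational. Expansion also yields

e. $$\nabla\cdot\nabla\times\mathbf{V} = \operatorname{div}\operatorname{curl}\mathbf{V} = 0 \tag{4-31}$$

Thus if for any vector W, $\nabla\cdot\mathbf{W} = 0$ then $\mathbf{W} = \nabla\times\mathbf{V}$ and W is said to be
solenoidal.

Finally, the reader will easily check by expansion in rectangular
components, the relation

f. $$\nabla\times(\nabla\times\mathbf{V}) = \operatorname{curl}\operatorname{curl}\mathbf{V} = \operatorname{grad}\operatorname{div}\mathbf{V} - \nabla^2\mathbf{V} = \nabla(\nabla\cdot\mathbf{V}) - \nabla\cdot\nabla\mathbf{V} \tag{4-32}$$

Problem. Show by expansion that eqs. (4-27, 28, 29, 30, 31, 32) are correct.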
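Short of carrying out the expansion, the identities (4-30) and (4-31) can at least be spot-checked numerically. The sketch below is not a proof: it assumes an arbitrary scalar function φ and an arbitrary vector field V, replaces every partial derivative by a symmetric difference quotient with a fixed step `H`, and evaluates curl grad φ and div curl V at one point. All names are hypothetical helpers.

```python
import math

H = 1e-3  # step of the difference quotients (an arbitrary choice)

def partial(f, i):
    """Return a function approximating the partial derivative of f along axis i."""
    def df(x, y, z):
        plus = [x, y, z]
        minus = [x, y, z]
        plus[i] += H
        minus[i] -= H
        return (f(*plus) - f(*minus)) / (2.0 * H)
    return df

def grad(f):
    return [partial(f, 0), partial(f, 1), partial(f, 2)]

def curl(v):
    vx, vy, vz = v
    return [
        lambda x, y, z: partial(vz, 1)(x, y, z) - partial(vy, 2)(x, y, z),
        lambda x, y, z: partial(vx, 2)(x, y, z) - partial(vz, 0)(x, y, z),
        lambda x, y, z: partial(vy, 0)(x, y, z) - partial(vx, 1)(x, y, z),
    ]

def div(v):
    return lambda x, y, z: sum(partial(v[i], i)(x, y, z) for i in range(3))

def phi(x, y, z):
    """Assumed example of a scalar point function."""
    return math.sin(x) * y * y + x * math.exp(z)

# Assumed example of a vector field, given as three component functions
V = [lambda x, y, z: y * z * z,
     lambda x, y, z: math.sin(x * z),
     lambda x, y, z: x * y * y]

P = (0.3, -0.5, 0.8)
curl_grad = [c(*P) for c in curl(grad(phi))]   # eq. (4-30)
div_curl = div(curl(V))(*P)                    # eq. (4-31)
```

Both results vanish to within rounding error, because the difference quotients along different axes commute just as the partial derivatives do.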




4.14. Vector Integration.—As a simple example of vector integration,
we consider the motion of a particle under the constant acceleration of
gravity. The equation of motion is

$$d^2\mathbf{R}/dt^2 = \mathbf{G}$$

where G is a constant vector. Integration results in $d\mathbf{R}/dt = \mathbf{G}t + \mathbf{V}_0$;
$\mathbf{R} = \mathbf{G}t^2/2 + \mathbf{V}_0t + \mathbf{C}_0$, where $\mathbf{V}_0$ and $\mathbf{C}_0$ are the constants of integration
which are vectors not necessarily collinear with G. They are determined
by the values of $d\mathbf{R}/dt$ and R, respectively, when t = 0.

More complicated cases may arise, however, for in the general case, the
integral is $\int\psi\ast d\tau$, where $\psi$ and $d\tau$ may be scalars or vectors, $(\ast)$ has the
same meaning as before and the integrals may be multiple.

4.15. Line Integrals.—Suppose $d\tau$ is the vector $d\mathbf{s}$, where $\mathbf{s} = \mathbf{s}(t)$
is the equation for a curve. It is then possible to form the integrals:

$$\text{(a)}\ \int_C \phi\,d\mathbf{s}; \qquad \text{(b)}\ \int_C \mathbf{V}\cdot d\mathbf{s}; \qquad \text{(c)}\ \int_C \mathbf{V}\times d\mathbf{s} \tag{4-33}$$

each of these being called the line integral along the curve C. The results
of integration are respectively, a vector, a scalar and a vector.
Since

$$d\mathbf{s} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz \tag{4-34}$$

the first integral in (33) becomes:

$$\int_C \phi\,d\mathbf{s} = \int_A^B \phi(x,y,z)(\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz) = \mathbf{i}\int_{x_1}^{x_2}\phi(x,y,z)\,dx + \mathbf{j}\int_{y_1}^{y_2}\phi(x,y,z)\,dy + \mathbf{k}\int_{z_1}^{z_2}\phi(x,y,z)\,dz$$

where A and B are initial and final points of the curve, with coordinates
$(x_1,y_1,z_1)$ and $(x_2,y_2,z_2)$. The first integral on the right may be evaluated
when y and z are known in terms of x for points on the curve C. The
remaining integrals are determined in a similar fashion. The problem thus
reduces to the usual line integral in scalar calculus except that it is
necessary to specify the direction in which the radius vector s describes the
curve during integration, for if the direction A to B is taken as positive,
then

$$\int_A^B \phi\,d\mathbf{s} = -\int_B^A \phi\,d\mathbf{s}$$

In case C is a closed curve, the direction is always taken so that the enclosed
area appears positive (cf. sec. 4.5).

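No difficulty is experienced in the interpretation of (b) and (c) of (33)
as the following example shows. Let $\mathbf{V} = xy\,\mathbf{i} - z^2\,\mathbf{j} + xyz\,\mathbf{k}$; evaluate
$\int\mathbf{V}\cdot d\mathbf{s}$ from the point A = (0,0,0) to B = (1,1,1) along the curve
$\mathbf{s} = \mathbf{i}t + \mathbf{j}t^2 + \mathbf{k}t^3$.

$$\int_A^B \mathbf{V}\cdot d\mathbf{s} = \int_A^B (xy\,\mathbf{i} - z^2\,\mathbf{j} + xyz\,\mathbf{k})\cdot(\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz) = \int_A^B (xy\,dx - z^2\,dy + xyz\,dz)$$

Since s is the position vector of points on the curve, the coordinates of
any point in terms of t are $x = t$, $y = t^2$, $z = t^3$. On the curve, therefore,
$xy = x^3$, $z^2 = y^3$ and $xyz = z^2$, so that

$$\int_A^B \mathbf{V}\cdot d\mathbf{s} = \int_0^1 x^3\,dx - \int_0^1 y^3\,dy + \int_0^1 z^2\,dz = \frac{1}{3}$$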
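The value 1/3 can be reproduced by integrating numerically with respect to the parameter t. The sketch below assumes the parametrization x = t, y = t², z = t³ of the curve and uses the composite Simpson rule, a swapped-in numerical technique and not the method of the text; `line_integral` and `ds_dt` are hypothetical helper names.

```python
def V(x, y, z):
    """The field of the example: V = xy i - z^2 j + xyz k."""
    return (x * y, -z * z, x * y * z)

def s(t):
    """The curve s = i t + j t^2 + k t^3."""
    return (t, t * t, t ** 3)

def ds_dt(t):
    """Derivative of s with respect to t, worked out by hand."""
    return (1.0, 2.0 * t, 3.0 * t * t)

def line_integral(field, curve, tangent, t0, t1, n=2000):
    """Composite Simpson rule for the integral of V . (ds/dt) dt; n must be even."""
    def g(t):
        v = field(*curve(t))
        d = tangent(t)
        return sum(a * b for a, b in zip(v, d))
    h = (t1 - t0) / n
    total = g(t0) + g(t1)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(t0 + k * h)
    return total * h / 3.0

value = line_integral(V, s, ds_dt, 0.0, 1.0)
```

The integrand in t is t³ − 2t⁷ + 3t⁸, whose integral from 0 to 1 is 1/4 − 1/4 + 1/3, in agreement with the result above.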
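An important special case arises in scalar calculus when the function to
be integrated is an exact differential, where the value of the integral is
independent of the path. In vector calculus, suppose

$$\mathbf{V} = \operatorname{grad}\phi = \nabla\phi \tag{4-35}$$

with $\phi$ a scalar point function. Then using (22) and (34)

$$\int_A^B \mathbf{V}\cdot d\mathbf{s} = \int_A^B \nabla\phi\cdot d\mathbf{s} = \int_A^B \left(\frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy + \frac{\partial\phi}{\partial z}\,dz\right) = \int_A^B d\phi = \phi_B - \phi_A \tag{4-36}$$

If the integration is taken around a closed curve, B = A, then

$$\int_A^A \nabla\phi\cdot d\mathbf{s} = \oint \nabla\phi\cdot d\mathbf{s} = 0$$

Conversely, if $\oint\mathbf{V}\cdot d\mathbf{s} = 0$, then (35) must hold, i.e., V is the gradient of
some scalar point function $\phi$. We have therefore shown that if $\mathbf{V} = \operatorname{grad}\phi$,
the line integral $\int_A^B \mathbf{V}\cdot d\mathbf{s}$ depends only on the initial and final values of $\phi$
and is independent of the path.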

4.16. Surface and Volume Integrals.—Let $\Sigma$ be any surface, divided into
infinitesimal elements each of which may be considered as a vector, $d\mathbf{S}$.
The surface integral may then be described as in ordinary analysis, but
again there are three cases:

$$\text{(a)}\ \iint_S \phi\,d\mathbf{S}; \qquad \text{(b)}\ \iint_S \mathbf{V}\cdot d\mathbf{S}; \qquad \text{(c)}\ \iint_S \mathbf{V}\times d\mathbf{S}$$

giving a vector, a scalar and a vector. As before, it is important to specify
the side of the surface over which the integration is performed, for although
$d\mathbf{S}$ is normal to the surface, the signs of the normals on opposite sides are
opposite. The sign of the normal is uniquely determined by the previous
conventions except for the case of a one-sided surface⁵ such as the Möbius
strip. If the surface encloses a portion of space, $d\mathbf{S}$ is taken as the outward
pointing normal. The surface integral $\iint_S \mathbf{V}\cdot d\mathbf{S}$ is called the flux of V
through the surface, for if V is the product of density and the velocity of a
fluid, the integral is the amount of fluid flowing through a surface in unit
time. The vector V may also refer to electric, magnetic or gravitational
force, flow of heat and so on.

5 See, for example, Burington, R. S., and Torrance, C. C., "Higher Mathematics,"
McGraw-Hill Book Co., New York, 1939, pp. 2508.

Let $d\tau = dx\,dy\,dz$ be an element of volume. Since this is a scalar, there
are only two possible volume integrals

$$\text{(a)}\ \iiint \phi\,d\tau; \qquad \text{(b)}\ \iiint \mathbf{V}\,d\tau$$

the first being a scalar and the second a vector.
It is often convenient to convert multiple integrals into others with
fewer integral signs. One possibility has previously been presented in (36),
namely that the line integral $\int_C \mathbf{V}\cdot d\mathbf{s}$ may be reduced to the difference
between two scalar quantities, provided $\mathbf{V} = \nabla\phi$. A line integral may also
be converted into a double or surface integral by Stokes' theorem, or
conversely, the double integral may be reduced to a single integral.

4.17. Stokes' Theorem.—This theorem may be stated in the form

$$\oint_C \mathbf{V}\cdot d\mathbf{s} = \iint_S \nabla\times\mathbf{V}\cdot d\mathbf{S} \tag{4-37}$$

Conversely, if $\mathbf{W} = \nabla\times\mathbf{V}$, where V is another vector, then the value of
the surface integral $\iint_S \mathbf{W}\cdot d\mathbf{S}$ depends only upon values of V at points
on the boundary of the surface,

$$\iint_S \mathbf{W}\cdot d\mathbf{S} = \oint_C \mathbf{V}\cdot d\mathbf{s}$$

The vector V may be taken as flux density of a fluid or as the field of a
mechanical or electrical force. In the latter case, the line integral
represents the work done on a particle moving along a curve C. If the curve is
closed, forming the boundary of a region $\Sigma$, then according to the theorem
the work done equals the surface integral of the curl of the force field.
In the special case where the work done is independent of the path, the line
integral around every closed curve vanishes so that a requirement for
independence of the path is that $\nabla\times\mathbf{V} = 0$.
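Both members of (4-37) may be evaluated numerically for a simple case. The sketch below assumes the plane field V = −y³ i + x³ j and takes for the surface the unit disk in the XY-plane, bounded by the unit circle described counter-clockwise; the k-component of the curl, 3x² + 3y², has been worked out by hand from (4-24). The midpoint rule and the function names are choices made for this illustration.

```python
import math

def V(x, y):
    """Assumed plane field V = -y^3 i + x^3 j."""
    return (-y ** 3, x ** 3)

def curl_z(x, y):
    """k-component of curl V, obtained by hand from eq. (4-24)."""
    return 3.0 * x * x + 3.0 * y * y

def circulation(n=2000):
    """Line integral of V . ds counter-clockwise around the unit circle."""
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        vx, vy = V(x, y)
        # ds = (-sin t, cos t) dt on the unit circle
        total += (vx * (-math.sin(t)) + vy * math.cos(t)) * dt
    return total

def surface_integral(nr=400, nt=400):
    """Integral of (curl V) . dS over the unit disk, with dS = k r dr dtheta."""
    total = 0.0
    dr = 1.0 / nr
    dth = 2.0 * math.pi / nt
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            th = (j + 0.5) * dth
            total += curl_z(r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total

line_side = circulation()
surface_side = surface_integral()
```

Both quantities come out close to 3π/2, the small difference being the error of the numerical integration over the disk.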

A proof of Stokes' theorem follows. Consider a surface $\Sigma$ bounded by
the closed contour C. Let C' be the projection of C on the XY-plane;
we are thus associating a point P(x,y) on the plane with every point
P'(x,y,z) of the surface. This means that on the surface, where z is a
function of x and y, a function u(x,y,z) becomes

$$u(x,y,z) = \phi(x,y) \tag{4-38}$$

since the value of $\phi$ on C' must equal the value of u on C. Similarly with
other functions

$$v(x,y,z) = \chi(x,z); \qquad w(x,y,z) = \psi(y,z)$$

when projections are made on the XZ- and YZ-planes. We may write for
the vector defined at each point on the surface,

$$\mathbf{V} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$$


If we furthermore take a unit vector n, perpendicular to the surface at any 
point, the right-hand side of (837) becomes after expansion 


f fav x vas = ffn xwtv xoitv x wks 4-39) 
5 s 
A typical term of (39) may be transformed as follows | 
ð 
= —n-k-=- (4-40) 
oy 


the second expression coming from (24). The last member of (40) is 
obtained as follows. The partial derivative 


ofs = zi + yj + zk is a vector, tangent to the curve cut from 2 by a plane 
perpendicular to the X-axis. It is perpendicular to n, hence 


a-{j+eS} =o (4-41) 
Substitution of (41) and the partial derivative of (88), 

a6 du , ôu de 

oy dy Oz Oy 
into (40), gives the last term of that equation. Sincen-kdS = dxdy, we 


may write 
ð 
ff =y xus =- f fZ aay (4-42) 
8 oy 


The integral on the right of (42) may be written 


Sf Seeds = fe- oa 
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where ġ and ¢; are the values of ¢ at the maximum and minimum values 
of y, Y2 and yı, respectively. If do is a line element of the contour C’, we 
may write dr = +(d2/dc)deo, choosing the sign in accordance with the 
position of de on the contour. Since it is negative at ye and positive at yı, 
the integral becomes 


-f (be + 61) Fae = ~ foes 


Remembering (38) and the fact that between two points on C’ the change 
in z is the same as that between the equivalent points on C, 


$ ġdr = $ udr 
cr c 
so that we finally have 


f fey xuas = Suaa 
S c 


Similar equations are obtained from consideration of projections of £ on 
the XZ- and YZ-planes. When they are added together, Stokes’ theorem 
results. 
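
As a numerical illustration of the single-component identity reached at the end of this proof, the following sketch compares the surface integral of n . curl(u i) with the contour integral of u dx. The test function u = y(1 + z), the paraboloid z = 1 - x^2 - y^2 over the unit disk, and all function names are assumptions chosen only for illustration; they do not come from the text.

```python
import math

def u(x, y, z):
    # Hypothetical test function u(x, y, z); any smooth choice would do.
    return y * (1.0 + z)

def surface_side(n_r=200, n_t=200):
    """Midpoint-rule estimate of the surface integral of n . curl(u i) over
    the paraboloid z = 1 - x^2 - y^2 above the unit disk (upward normal)."""
    total = 0.0
    dr, dt = 1.0 / n_r, 2.0 * math.pi / n_t
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_t):
            t = (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            z = 1.0 - x * x - y * y
            dz_dy = -2.0 * y
            du_dy = 1.0 + z   # derivative of u with x and z held fixed
            du_dz = y         # derivative of u with x and y held fixed
            # n dS = (-dz/dx, -dz/dy, 1) dx dy and curl(u i) = (0, du/dz, -du/dy)
            integrand = -dz_dy * du_dz - du_dy
            total += integrand * r * dr * dt
    return total

def contour_side(n=2000):
    """Integral of u dx around the rim x = cos t, y = sin t, z = 0,
    traversed counterclockwise as seen from above."""
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        total += u(x, y, 0.0) * (-math.sin(t)) * dt
    return total

print(surface_side(), contour_side())
```

For this particular choice both integrals can also be done by hand and equal -pi, so the two printed numbers should agree to within the quadrature error.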

4.18. Theorem of the Divergence.—A method of reducing triple inte- 
grals to double integrals is offered in the theorem of the divergence, which may 


be written

$$\iiint_\tau \nabla\cdot\mathbf{V}\,d\tau = \iint_S \mathbf{V}\cdot d\mathbf{S} \tag{4-43}$$


The Cartesian form of this equation

$$\iiint \left(\frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}\right) d\tau = \iint (V_x\,dy\,dz + V_y\,dx\,dz + V_z\,dx\,dy)$$

is often called Gauss' theorem. Suppose $\mathbf{V}$ represents the flux density of an
incompressible fluid. Then, as we have shown, $\nabla\cdot\mathbf{V}\,d\tau$ is the total amount of
fluid flowing out of a volume $d\tau$ per second. The total flow from a large
volume is $\iiint \nabla\cdot\mathbf{V}\,d\tau$, which must equal the rate of flow across all of
the surfaces of the volume, $\iint \mathbf{V}\cdot d\mathbf{S}$. This proves the theorem.⁶ If
we assume a steady state, the total amount of flow neither increases nor
decreases in time and hence must be maintained constant by sources or
sinks within the region, unless the density of the fluid is continually
changing (which is contrary to the initial assumption). In view of Gauss'
theorem the divergence of the field takes on an interesting meaning. Since

$$\operatorname{div}\mathbf{V} = \nabla\cdot\mathbf{V} = \lim_{\tau\to 0}\frac{1}{\tau}\iint \mathbf{V}\cdot d\mathbf{S} \tag{4-44}$$

the divergence is the same as the intensity of the steady flow at a given
point. This argument may be continued to derive the equation of
continuity which has been obtained in another way in sec. 4.10.

6 An analytical proof, which does not depend on the flow of a liquid and is similar
to the one given here for Stokes' theorem, may be found in books on vector analysis.
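
The theorem of the divergence is easy to check numerically on a simple region. The sketch below, offered only as an illustration, integrates div V over the unit cube and compares the result with the outward flux through its six faces. The test field V = (x^2, xy, z) and the function names are assumptions, not part of the text.

```python
def V(x, y, z):
    # Hypothetical test field, chosen only for illustration.
    return (x * x, x * y, z)

def div_V(x, y, z):
    # Analytic divergence of the field above: 2x + x + 1.
    return 3.0 * x + 1.0

def volume_integral(n=40):
    """Midpoint-rule integral of div V over the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += div_V((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
    return total * h ** 3

def surface_flux(n=40):
    """Midpoint-rule integral of V . dS over the six faces (outward normals)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            a, b = (i + 0.5) * h, (j + 0.5) * h
            total += V(1.0, a, b)[0] - V(0.0, a, b)[0]   # faces x = 1 and x = 0
            total += V(a, 1.0, b)[1] - V(a, 0.0, b)[1]   # faces y = 1 and y = 0
            total += V(a, b, 1.0)[2] - V(a, b, 0.0)[2]   # faces z = 1 and z = 0
    return total * h * h

print(volume_integral(), surface_flux())
```

Both numbers should equal 5/2 for this field, since the integrands are linear and the midpoint rule integrates them without discretization error.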

A further application of the divergence theorem arises in the problem of
heat flow. Consider the flow of heat into a thermally isotropic solid body,
the temperature of which is not the same at all points. The rate of flow
of heat into the body is*

$$-\int_S \mathbf{V}\cdot d\mathbf{S}$$

where $\mathbf{V}$ is the flux of heat, the amount of heat which crosses unit area
drawn perpendicular to the lines of flow per unit time. By Fourier's law,
heat flows in the direction of most rapid decrease in temperature, $U$, with a
rate proportional to the thermal conductivity, $\kappa$, of the solid, or

$$\mathbf{V} = -\kappa\nabla U \tag{4-45}$$

If there are no sources or sinks of heat within the body, and if $\rho$ is the
density of the solid and $s$ its specific heat, the amount of heat entering unit
volume in unit time is

$$s\rho\,\frac{\partial U}{\partial t}$$

For the whole body, the heat gained must equal that passing through the
surface

$$\int s\rho\,\frac{\partial U}{\partial t}\,d\tau = -\int_S \mathbf{V}\cdot d\mathbf{S}$$

and this becomes in view of eq. (43)

$$\int \left(s\rho\,\frac{\partial U}{\partial t} + \nabla\cdot\mathbf{V}\right) d\tau = 0$$

This equation must hold for every surface, hence

$$s\rho\,\frac{\partial U}{\partial t} + \nabla\cdot\mathbf{V} = 0$$

* Henceforth, single integral signs will be written in multiple integrals when the
meaning is otherwise clear.

Thus because of (45)

$$s\rho\,\frac{\partial U}{\partial t} = \nabla\cdot(\kappa\nabla U)$$

or

$$\frac{\partial U}{\partial t} = a^2\nabla^2 U$$

with $a^2 = \kappa/s\rho$ and $\kappa$ assumed constant. For a stationary state, $\nabla^2 U = 0$;
this is Laplace's equation, so the same law holds for the distribution of
temperature as for the distribution of potential in charge-free space.
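
The passage from the heat-flow equation to Laplace's equation in the stationary state can be illustrated in one dimension. The following sketch is an assumption-laden example rather than a method from the text: it uses a rod of unit length, ends held at U = 0 and U = 1, an explicit finite-difference step, and a hypothetical parameter `diffusivity` standing for the constant kappa/(s rho). It lets the temperature relax until it no longer changes.

```python
def relax_rod(n=21, diffusivity=1.0, t_end=2.0):
    """Explicit finite-difference integration of
    dU/dt = diffusivity * d^2U/dx^2 on 0 <= x <= 1,
    with the ends held at U = 0 and U = 1 and U = 0 initially inside."""
    dx = 1.0 / (n - 1)
    # Time step kept below the stability limit dx^2 / (2 * diffusivity).
    dt = 0.4 * dx * dx / diffusivity
    U = [0.0] * n
    U[-1] = 1.0
    steps = int(t_end / dt)
    for _ in range(steps):
        new = U[:]
        for i in range(1, n - 1):
            second_difference = (U[i - 1] - 2.0 * U[i] + U[i + 1]) / (dx * dx)
            new[i] = U[i] + diffusivity * dt * second_difference
        U = new
    return U

profile = relax_rod()
print(profile)
```

In one dimension Laplace's equation reduces to a vanishing second derivative, so the relaxed profile should be the straight line U = x joining the two end temperatures.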

4.19. Green’s Theorems.—The three fundamental relations (36), 
(37) and (43) may be used to obtain a large number of formulas for the
transformation of integrals, the results corresponding to integration by
parts in scalar calculus. The two most important such formulas are
known as Green's theorems, when given in Cartesian form. In vector
notation, these are

$$\int \nabla\phi\cdot\nabla\psi\,d\tau = \int \phi\nabla\psi\cdot d\mathbf{S} - \int \phi\nabla^2\psi\,d\tau = \int \psi\nabla\phi\cdot d\mathbf{S} - \int \psi\nabla^2\phi\,d\tau \tag{4-46}$$

$$\int (\phi\nabla^2\psi - \psi\nabla^2\phi)\,d\tau = \int (\phi\nabla\psi - \psi\nabla\phi)\cdot d\mathbf{S} \tag{4-47}$$

Green's first theorem is easily found by substituting $\mathbf{V} = \phi\nabla\psi$ in (43).
The second theorem is obtained by interchanging $\phi$ and $\psi$ in (46) and
subtracting the result from (46).


Problem. Verify eqs. (46) and (47). 


4.20. Tensors.—In many physical problems, the notion of a vector is 
too restricted. For example, in an isotropic medium, stress S and strain X 
are related by the vector equation S = kX, X and S having the same direc- 
tion. If the medium is not isotropic, S and X are not in general in the 
same direction; it is then necessary to replace the scalar k by a more 
general mathematical construct capable, when acting on the vector X, 
of changing its direction as well as its magnitude. Such a construct is a 
tensor. A similar generalization has to be made in the vector equations

$$\mathbf{P} = \epsilon\mathbf{E}$$

where $\mathbf{P}$ and $\mathbf{E}$ represent electric polarization and field strength, and

$$\mathbf{I} = \mu\mathbf{H}$$

where $\mathbf{I}$ and $\mathbf{H}$ represent intensity of magnetization and field strength; for
anisotropic media, the susceptibilities $\epsilon$ and $\mu$ must be replaced by tensors.

Again, if it is desired to represent the displacements $\delta\mathbf{v}$ of the points
in a strained elastic medium as functions of their position vectors $\mathbf{v}$, a
tensor equation of the form $\delta\mathbf{v} = t\mathbf{v}$ is needed, for $\delta\mathbf{v}$ and $\mathbf{v}$ differ in
direction, and the tensor $t$ must effect this difference. This example will be
treated in detail in sec. 4.23; but first, we shall discuss the analytical
properties of tensors.

Let us consider for complete generality a space of $\nu$ dimensions and
assume that two different reference frames are given so that a point whose
coordinates⁷ in the first one are $(x^1, x^2, \cdots, x^\nu)$ has the coordinates
$(\bar{x}^1, \bar{x}^2, \cdots, \bar{x}^\nu)$ in the second system. Further let there be relations

$$\bar{x}^m = \bar{x}^m(x^1, x^2, \cdots, x^\nu); \qquad x^m = x^m(\bar{x}^1, \bar{x}^2, \cdots, \bar{x}^\nu); \qquad m = 1, 2, 3, \cdots, \nu \tag{4-48}$$

so that we may transform from one system to the other. Then if $\nu$
quantities $(A^1, A^2, \cdots, A^\nu)$ are related to $\nu$ other quantities
$(\bar{A}^1, \bar{A}^2, \cdots, \bar{A}^\nu)$ by the equations

$$\bar{A}^m = \sum_{i=1}^{\nu} \frac{\partial\bar{x}^m}{\partial x^i}\,A^i; \qquad m = 1, 2, \cdots, \nu \tag{4-49}$$

they are said to be the components of a contravariant vector or a tensor of
the first rank. To simplify the notation, it is customary to omit the
summation sign and sum over indices which are repeated on the same side of the
equation. An index which is not repeated is understood to take
successively the values $1, 2, \cdots, \nu$, so that there are altogether $\nu$ different equations.
With these conventions, we may rewrite (49) as

$$\bar{A}^m = \frac{\partial\bar{x}^m}{\partial x^i}\,A^i \tag{4-50}$$

A further word about notation should be added. Since a repeated index
(it is often called a dummy or umbral index) indicates summation, another
letter may be substituted for it at will. Thus (50) may also be written

$$\bar{A}^m = \frac{\partial\bar{x}^m}{\partial x^j}\,A^j$$

We will often use the same symbol such as $A^i$ to indicate both the tensor
and the $i$-th component of a tensor. No confusion should result from this
arrangement.


7 The upper suffix is not an exponent. Its position has an important meaning as the
subsequent discussion will show.

A covariant vector with components $A_m$ in one system and $\bar{A}_m$ in another
is defined by the relation

$$\bar{A}_m = \frac{\partial x^i}{\partial\bar{x}^m}\,A_i \tag{4-51}$$

If (48) is differentiated we obtain

$$d\bar{x}^m = \frac{\partial\bar{x}^m}{\partial x^i}\,dx^i \tag{4-51a}$$

hence we see that the components of an ordinary vector in $\nu$-dimensional
space are actually the components of a contravariant tensor of rank one.
To find an example of a covariant vector consider a scalar point function
$\phi(x^m) = \bar{\phi}(\bar{x}^m)$. The components of the gradient of $\phi$ will be $\partial\phi/\partial x^m$ and

$$\frac{\partial\bar{\phi}}{\partial\bar{x}^m} = \frac{\partial\phi}{\partial x^i}\frac{\partial x^i}{\partial\bar{x}^m}$$

Thus the gradient of such a function is a covariant vector. The reader
should not conclude, however, that a covariant vector is necessarily the
gradient of a scalar.
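
The two transformation laws can be tried out numerically. The sketch below is an illustration under stated assumptions: a two-dimensional linear change of frame with a made-up constant matrix `M`, an arbitrary scalar function `phi`, and gradients taken by central differences. It transforms a displacement by the contravariant rule (4-51a) and a gradient by the covariant rule (4-51), and compares the latter with the gradient computed directly in the new frame.

```python
# Hypothetical linear change of frame: xbar^m = M[m][i] * x^i (two dimensions).
M = [[2.0, 1.0],
     [0.5, 3.0]]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

M_INV = inverse_2x2(M)   # M_INV[i][m] = d x^i / d xbar^m

def to_new(x):
    return [sum(M[m][i] * x[i] for i in range(2)) for m in range(2)]

def to_old(xbar):
    return [sum(M_INV[i][m] * xbar[m] for m in range(2)) for i in range(2)]

def phi(x):
    # Arbitrary scalar point function used only as a test case.
    return x[0] ** 2 * x[1] + x[1]

def gradient(f, p, h=1e-6):
    """Central-difference estimate of the partial derivatives of f at p."""
    g = []
    for k in range(2):
        up, dn = list(p), list(p)
        up[k] += h
        dn[k] -= h
        g.append((f(up) - f(dn)) / (2.0 * h))
    return g

x = [0.7, -1.2]
xbar = to_new(x)

grad_old = gradient(phi, x)
# Gradient of the same scalar, differentiated directly in the barred frame.
grad_new_direct = gradient(lambda q: phi(to_old(q)), xbar)
# Covariant rule: Abar_m = (d x^i / d xbar^m) A_i.
grad_new_rule = [sum(M_INV[i][m] * grad_old[i] for i in range(2)) for m in range(2)]

# Contravariant rule applied to a small displacement dx^i.
dx = [0.3, 0.1]
dxbar = [sum(M[m][i] * dx[i] for i in range(2)) for m in range(2)]

print(grad_new_direct, grad_new_rule)
print(sum(a * b for a, b in zip(dx, grad_old)),
      sum(a * b for a, b in zip(dxbar, grad_new_rule)))
```

The second printed pair shows that the sum of products of a contravariant and a covariant vector has the same value in both frames, which is the change in the scalar along the displacement to first order.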
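These ideas may be extended easily to define tensors of any rank.
If $\phi(x^m) = \bar{\phi}(\bar{x}^m)$, we speak of $\phi$ as a tensor of zero rank or a scalar or
invariant. There are three varieties of second rank tensors⁸ defined by the
transformations

$$\bar{A}^{mn} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial\bar{x}^n}{\partial x^j}\,A^{ij} \tag{4-52}$$

$$\bar{A}_{mn} = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\,A_{ij} \tag{4-53}$$

$$\bar{A}^m_n = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\,A^i_j \tag{4-54}$$

They are called contravariant, covariant and mixed, respectively. A
useful mixed tensor of the second rank is the Kronecker delta,

$$\delta^m_n = \begin{cases} 1; & m = n \\ 0; & m \neq n \end{cases} \tag{4-55}$$

This is seen as follows. Suppose $\delta^i_j$ is this tensor in the coordinate system
$x^i$; then from (54)

$$\bar{\delta}^m_n = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\,\delta^i_j = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^i}{\partial\bar{x}^n} = \frac{\partial\bar{x}^m}{\partial\bar{x}^n} = \delta^m_n$$

We thus see that $\delta^m_n$ has the same components in all coordinate systems.

Tensors of higher rank are defined by similar laws, for example, a
mixed tensor of rank four is

$$\bar{A}^m_{npq} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^p}\frac{\partial x^h}{\partial\bar{x}^q}\,A^i_{jkh} \tag{4-56}$$

It should be noted that if $\nu$ is the number of dimensions of the coordinate
system, then a tensor of rank $\alpha$ has $\nu^\alpha$ components.

8 Tensors of the second rank are also called dyadics; see Gibbs-Wilson, loc. cit.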

4.21. Addition, Multiplication and Contraction.—-The sum or difference 
of two or more tensors of the same rank and type is a tensor of the same 
rank and type. For example, if 


$$A^{mn} + B^{mn} = C^{mn}$$

it follows from (52) that $C^{mn}$ is a tensor. It frequently happens that the
components of a tensor satisfy the relation

$$A^{mn} = A^{nm}$$

such a tensor being called symmetric. On the other hand, if $A^{mn} = -A^{nm}$,
the tensor is skew-symmetric. When neither of these relations holds, a
given tensor may always be written as the sum of a symmetric and a
skew-symmetric tensor. To see this let us take

$$S^{mn} = \tfrac{1}{2}(A^{mn} + A^{nm}); \qquad T^{mn} = \tfrac{1}{2}(A^{mn} - A^{nm}) \tag{4-57}$$

where $A^{mn}$ is neither symmetric nor skew-symmetric. Then

$$A^{mn} = S^{mn} + T^{mn}$$

The property of being symmetric or skew-symmetric is unaltered when a
tensor is transformed from one reference frame to another.
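
The decomposition (4-57), and the statement that symmetry survives a change of frame, can be checked on a small example. The following is a minimal sketch with made-up numbers: the components in `A` and the constant Jacobian `J` of a linear change of frame are assumptions chosen for illustration.

```python
N = 3

# Arbitrary components A^{mn}, neither symmetric nor skew-symmetric.
A = [[1.0, 4.0, -2.0],
     [0.0, 3.0, 5.0],
     [6.0, -1.0, 2.0]]

# Symmetric and skew-symmetric parts, as in (4-57).
S = [[0.5 * (A[m][n] + A[n][m]) for n in range(N)] for m in range(N)]
T = [[0.5 * (A[m][n] - A[n][m]) for n in range(N)] for m in range(N)]

# Hypothetical constant Jacobian J[m][i] = d xbar^m / d x^i.
J = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0],
     [3.0, 0.0, 1.0]]

def transform(t):
    """Contravariant second-rank rule (4-52): tbar^{mn} = J^m_i J^n_j t^{ij}."""
    return [[sum(J[m][i] * J[n][j] * t[i][j] for i in range(N) for j in range(N))
             for n in range(N)] for m in range(N)]

S_bar = transform(S)
T_bar = transform(T)

print(S_bar)
print(T_bar)
```

The printed matrices should again be symmetric and skew-symmetric respectively, even though `J` itself has no symmetry.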

An important relation exists between vectors and skew-symmetric
tensors. Suppose $\mathbf{C} = \mathbf{A} \times \mathbf{B}$, where the components of $\mathbf{C}$ are given by
(13). But the components of $\mathbf{A}$ (or $\mathbf{B}$) form a skew-symmetric tensor,
$a_{ij} = -a_{ji}$, $a_{ii} = 0$, where $a_{13} = A_y$, $a_{21} = A_z$, $a_{32} = A_x$. We note,
however, that if the vectors $\mathbf{A}$ and $\mathbf{B}$ were drawn in a left-handed coordinate
system, their directions would both be opposite to those in a right-handed
system while $\mathbf{C}$, their vector product, would have the same direction in both
coordinate systems.

The more common type of vector, such as that representing translation
or a mechanical force, is often called a polar vector to distinguish it from a
vector $\mathbf{C}$ which has the unusual behavior just described. The latter, called
an axial vector or a pseudovector,⁹ requires the idea not merely of a
displacement, but of some basic direction such as that implicit in a right-handed
(or left-handed) coordinate system. A typical example of it is the vector
product of two polar vectors, like angular momentum or the moment of a
force. A pseudovector in a three-dimensional Cartesian coordinate system,
as we have seen, behaves in most respects like a proper vector but in the
more general case it transforms like a skew-symmetric tensor.

The scalar product of a polar vector and a pseudovector is called a
pseudoscalar. It differs from a true scalar, which must have the same
magnitude in all coordinate systems, since it will change its sign if the
direction of its coordinate system is changed.


Problem. Show that in two dimensions a skew-symmetric tensor of second rank
is a pseudoscalar and that one of third rank is impossible; in three dimensions, that a
second rank skew-symmetric tensor is a pseudovector and a third rank tensor is a
pseudoscalar.


If we write a tensor in matrix form and compare it with eq. (10-16) it is 
clear that the components of the tensor are also the elements of a matrix. 
The only difference lies in the fact that tensors may always be written in 
matrix form if so desired, but the elements of a matrix do not need to 
transform in the same manner as tensors. 

If we multiply $A^m$ by $B_n$, we obtain the mixed tensor $A^m B_n = C^m_n$. It
is easily seen that $C^m_n$ transforms like (54). This type of product, called the
outer product, may be obtained with tensors of any rank or type; thus
$A^m_n B_{pq} = C^m_{npq}$. It should not be inferred, however, that every tensor can
be written as a product in this way. Neither should we conclude that the
outer product is the same as the vector product of sec. 4.5.

Let us set $m = q$ in the mixed tensor of (56) and write $B_{np} = A^m_{npm}$.
To show that our notation, which indicates that $B_{np}$ is a covariant tensor
of rank two, is justified we use the transformation law (56),

$$\bar{A}^m_{npm} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^p}\frac{\partial x^h}{\partial\bar{x}^m}\,A^i_{jkh} = \frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^p}\,A^i_{jki}$$

Comparison with (53) convinces us that $A^i_{jki}$ is indeed a covariant tensor
of rank two since it transforms in the required way. This process of
summing over a pair of contravariant and covariant indices is called
contraction. It always reduces the rank of a mixed tensor by two, thus
when it is applied to a mixed tensor of rank two the result is a scalar:

$$\bar{A}^m_m = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^m}\,A^i_j = A^i_i$$

9 For further properties of them, see Herbert Goldstein, “Classical Mechanics,”
Addison-Wesley Press, Inc., Cambridge, Mass., 1951.

When two tensors are multiplied together and then contracted, we speak of
inner multiplication, thus

$$A^{mn}B_{npq} = C^m_{pq}; \qquad A^m B_m = \text{a scalar}$$

The last example is clearly equivalent to the scalar product in rectangular
coordinates (cf. sec. 4.4), hence in tensor analysis, we say that if $l$ is the
length of $A^m$ or $A_m$

$$l^2 = A^m A_m \tag{4-58}$$

From (8) we conclude that the angle $\theta$ between two vectors $A_m$ and $B_m$ is
defined by

$$\cos\theta = \frac{A_m B^m}{[(A_n A^n)(B_n B^n)]^{1/2}}$$

and if $A_m$ and $B_m$ are perpendicular to each other,

$$A_m B^m = 0$$

We have just shown how new tensors may be obtained by addition,
multiplication and contraction. We now inquire whether it is possible to
change contravariant tensors to covariant ones or the reverse. Let $g_{mn}$
be any symmetric covariant tensor and $g$ be the determinant of the
components of $g_{mn}$. Also let $G^{mn}$ be the co-factor¹⁰ of $g_{mn}$ in $g$, then if we define

$$g^{mn} = \frac{G^{mn}}{g} \tag{4-59}$$

it follows from the rules for the expansion of determinants that

$$g_{mn}\,g^{pn} = \delta^p_m \tag{4-60}$$

We would like to justify our notation and prove that $g^{mn}$ is actually a
tensor. Let $A^n$ be a vector, then $B_m = g_{mn}A^n$ is also a vector, moreover

$$g^{mn}B_m = g^{mn}g_{mp}A^p = \delta^n_p A^p = A^n$$

so that $g^{mn}$ changes a covariant vector into a contravariant one; hence it
must itself be a tensor.

10 Note that $G^{mn}$ is not a tensor. See sec. 10.8 for discussion of determinants and
sec. 5.16 for further properties of these tensors.

Two vectors related by the equations

$$A^m = g^{mn}A_n$$

or

$$A_m = g_{mn}A^n$$

are called associated. It is often said that both are the same vector, $A^m$
being the contravariant components and $A_m$ the covariant ones. Tensors
of any rank may be treated in the same way, thus

$$A^{mn} = g^{mp}g^{nq}A_{pq}$$

It should be clear that

$$A_m B^m = A^m B_m; \qquad A_{mn}B^{pn} = A_m{}^{n}\,B^{p}{}_{n}$$

Because of the fact that dummy indices may be changed from one letter to
another at will it follows that they enjoy a certain freedom of motion.
They may be raised in one place if they are lowered in another. We have
indicated this procedure in the last equation by spacing the indices. Such
information is needed, for it is not true that

$$A_m{}^{n} = g_{mp}A^{pn} \qquad \text{and} \qquad A^{n}{}_{m} = g_{pm}A^{np}$$

are identical unless $A$ is a symmetrical tensor.
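
The construction of the contravariant tensor from co-factors in (4-59), the relation (4-60), and the raising and lowering of indices can be followed on a three-dimensional example. In this sketch the symmetric matrix `g`, the vector `A_up` and the non-symmetric tensor `A2` are made-up numbers, and the helper names are assumptions introduced for illustration.

```python
N = 3

# Hypothetical symmetric covariant tensor g_mn with non-zero determinant.
g = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

def minor_det(a, m, n):
    """Determinant of the 2 x 2 array left after deleting row m and column n."""
    r0, r1 = [r for r in range(N) if r != m]
    c0, c1 = [c for c in range(N) if c != n]
    return a[r0][c0] * a[r1][c1] - a[r0][c1] * a[r1][c0]

def cofactor(a, m, n):
    return (-1) ** (m + n) * minor_det(a, m, n)

def determinant(a):
    # Expansion along the first row.
    return sum(a[0][n] * cofactor(a, 0, n) for n in range(N))

def contravariant_metric(a):
    """g^{mn} = G^{mn} / g, as in (4-59)."""
    d = determinant(a)
    return [[cofactor(a, m, n) / d for n in range(N)] for m in range(N)]

g_up = contravariant_metric(g)

# Lower and then raise the index of a vector: the original components return.
A_up = [1.0, -2.0, 0.5]
A_down = [sum(g[m][n] * A_up[n] for n in range(N)) for m in range(N)]
A_back = [sum(g_up[m][n] * A_down[n] for n in range(N)) for m in range(N)]

# For a non-symmetric A^{pn} it matters which index is lowered.
A2 = [[1.0, 2.0, 0.0],
      [0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
first_lowered = [[sum(g[m][p] * A2[p][n] for p in range(N)) for n in range(N)]
                 for m in range(N)]
second_lowered = [[sum(g[p][m] * A2[n][p] for p in range(N)) for n in range(N)]
                  for m in range(N)]

print(A_back)
print(first_lowered[0][1], second_lowered[0][1])
```

The last line prints two different numbers for the same pair of index values, which is why the spacing of the indices has to be recorded.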

4.22. Differentiation of Tensors.—It has been shown in sec. 4.20 that 
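the derivative of a scalar point function is a covariant vector. The
derivative of a covariant vector is not a tensor, however, for if

$$\bar{A}_m = \frac{\partial x^h}{\partial\bar{x}^m}\,A_h$$

then

$$\frac{\partial\bar{A}_m}{\partial\bar{x}^n} = \frac{\partial^2 x^h}{\partial\bar{x}^m\,\partial\bar{x}^n}\,A_h + \frac{\partial x^h}{\partial\bar{x}^m}\frac{\partial x^p}{\partial\bar{x}^n}\frac{\partial A_h}{\partial x^p} \tag{4-61}$$

and the presence of the second derivative shows that $\partial\bar{A}_m/\partial\bar{x}^n$ does not
transform like a tensor. In order to find a “derivative” of the proper
tensor character we first rewrite the second derivative in terms of first
derivatives. To do this let us use the two tensors $g_{ij}$ and $g^{ij}$ defined
previously. Let us further introduce the following quantities (they are not
tensors) called the Christoffel three-index symbols

$$[mn,q] = \frac{1}{2}\left(\frac{\partial g_{mq}}{\partial x^n} + \frac{\partial g_{nq}}{\partial x^m} - \frac{\partial g_{mn}}{\partial x^q}\right) \tag{4-62}$$

$$\{mn,q\} = \frac{1}{2}\,g^{qs}\left(\frac{\partial g_{ms}}{\partial x^n} + \frac{\partial g_{ns}}{\partial x^m} - \frac{\partial g_{mn}}{\partial x^s}\right) \tag{4-63}$$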

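the significance of which will soon be evident. From these definitions we
see that

$$\{mn,q\} = g^{qs}[mn,s] \tag{4-64}$$

According to (53), we have

$$\bar{g}_{mn} = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\,g_{ij} \tag{4-65}$$

and it is also true that

$$\frac{\partial g_{ij}}{\partial\bar{x}^q} = \frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial\bar{x}^q} \tag{4-66}$$

Differentiating (65) and using (66) we get

$$\frac{\partial\bar{g}_{mn}}{\partial\bar{x}^q} = g_{ij}\left(\frac{\partial^2 x^i}{\partial\bar{x}^q\,\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n} + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial^2 x^j}{\partial\bar{x}^q\,\partial\bar{x}^n}\right) + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}\frac{\partial g_{ij}}{\partial x^k} \tag{4-67}$$

In the same way if we differentiate $\bar{g}_{nq}$ and $\bar{g}_{mq}$ we obtain

$$\frac{\partial\bar{g}_{nq}}{\partial\bar{x}^m} = g_{ij}\left(\frac{\partial^2 x^i}{\partial\bar{x}^m\,\partial\bar{x}^n}\frac{\partial x^j}{\partial\bar{x}^q} + \frac{\partial x^i}{\partial\bar{x}^n}\frac{\partial^2 x^j}{\partial\bar{x}^m\,\partial\bar{x}^q}\right) + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}\frac{\partial g_{jk}}{\partial x^i} \tag{4-68}$$

$$\frac{\partial\bar{g}_{mq}}{\partial\bar{x}^n} = g_{ij}\left(\frac{\partial^2 x^i}{\partial\bar{x}^m\,\partial\bar{x}^n}\frac{\partial x^j}{\partial\bar{x}^q} + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial^2 x^j}{\partial\bar{x}^n\,\partial\bar{x}^q}\right) + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}\frac{\partial g_{ik}}{\partial x^j} \tag{4-69}$$

We may exchange $i$ and $j$ in the second term on the right of these
expressions. If we add (68) and (69), subtract (67) and use eq. (62) we obtain

$$\overline{[mn,q]} = g_{ij}\,\frac{\partial^2 x^i}{\partial\bar{x}^m\,\partial\bar{x}^n}\frac{\partial x^j}{\partial\bar{x}^q} + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}\,[ij,k]$$

where the bar over the Christoffel symbol indicates that it refers to the
coordinate system $\bar{x}^m$. Now multiply this equation by $\bar{g}^{qr}(\partial x^h/\partial\bar{x}^r)$ and
use (64), which gives

$$\overline{\{mn,r\}}\,\frac{\partial x^h}{\partial\bar{x}^r} = \bar{g}^{qr}\,g_{ij}\,\frac{\partial^2 x^i}{\partial\bar{x}^m\,\partial\bar{x}^n}\frac{\partial x^j}{\partial\bar{x}^q}\frac{\partial x^h}{\partial\bar{x}^r} + \bar{g}^{qr}\,\frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^q}\frac{\partial x^h}{\partial\bar{x}^r}\,[ij,k]$$

By means of (52) we may eliminate $\bar{g}^{qr}$ from the right-hand side of this
equation to obtain

$$\overline{\{mn,r\}}\,\frac{\partial x^h}{\partial\bar{x}^r} = g_{ij}\,g^{jh}\,\frac{\partial^2 x^i}{\partial\bar{x}^m\,\partial\bar{x}^n} + g^{kh}\,\frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\,[ij,k]$$

Finally, remembering that $g_{ij}\,g^{jh} = \delta^h_i$, we see that

$$\frac{\partial^2 x^h}{\partial\bar{x}^m\,\partial\bar{x}^n} = \overline{\{mn,r\}}\,\frac{\partial x^h}{\partial\bar{x}^r} - \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\,\{ij,h\}$$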
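Let us put this result into (61) which then becomes

$$\frac{\partial\bar{A}_m}{\partial\bar{x}^n} = \overline{\{mn,r\}}\,\frac{\partial x^h}{\partial\bar{x}^r}\,A_h - \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\,\{ij,h\}\,A_h + \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial A_i}{\partial x^j} \tag{4-70}$$

where we have changed the dummy indices in the last term from $h$ and $p$ to
$i$ and $j$. Finally we see from (51) that we have

$$\bar{A}_r = \frac{\partial x^h}{\partial\bar{x}^r}\,A_h$$

so that (70) may be written

$$\frac{\partial\bar{A}_m}{\partial\bar{x}^n} - \overline{\{mn,r\}}\,\bar{A}_r = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\left(\frac{\partial A_i}{\partial x^j} - \{ij,h\}\,A_h\right)$$

Now if we use the comma abbreviation

$$A_{i,j} = \frac{\partial A_i}{\partial x^j} - \{ij,h\}\,A_h$$

it follows that

$$\bar{A}_{m,n} = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\,A_{i,j}$$

hence this quantity is a covariant tensor of the second rank. It is called
the covariant derivative of $A_i$ with respect to $g_{ij}$.

In a similar way it may be shown that the covariant derivative of $A^i$
with respect to $g_{ij}$ is

$$A^i{}_{,j} = \frac{\partial A^i}{\partial x^j} + \{jh,i\}\,A^h \tag{4-71}$$

Problem a. Prove that $[mn,p] = g_{pq}\{mn,q\}$.

Problem b. Show that covariant derivatives of second-rank tensors may be derived in the form

$$A_{ij,k} = \frac{\partial A_{ij}}{\partial x^k} - A_{ih}\{jk,h\} - A_{hj}\{ik,h\}$$

$$A^{ij}{}_{,k} = \frac{\partial A^{ij}}{\partial x^k} + A^{ih}\{hk,j\} + A^{hj}\{hk,i\}$$

$$A^{i}{}_{j,k} = \frac{\partial A^{i}{}_{j}}{\partial x^k} + A^{h}{}_{j}\{hk,i\} - A^{i}{}_{h}\{jk,h\}$$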
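
The Christoffel symbols of (4-62) to (4-64) are straightforward to evaluate numerically once the tensor g is given as a function of the coordinates. The sketch below is an illustration only: it assumes plane polar coordinates (r, theta) with g = diag(1, r^2), takes the derivatives by central differences, and uses zero-based index values in the code, so that index 0 stands for r and index 1 for theta.

```python
def metric(q):
    """Covariant tensor g_ij for plane polar coordinates q = (r, theta)."""
    r = q[0]
    return [[1.0, 0.0],
            [0.0, r * r]]

def inverse_metric(q):
    """Contravariant tensor g^ij for the same coordinates."""
    r = q[0]
    return [[1.0, 0.0],
            [0.0, 1.0 / (r * r)]]

def d_metric(q, k, h=1e-5):
    """Central-difference estimate of d g_ij / d x^k at the point q."""
    up, dn = list(q), list(q)
    up[k] += h
    dn[k] -= h
    gu, gd = metric(up), metric(dn)
    return [[(gu[i][j] - gd[i][j]) / (2.0 * h) for j in range(2)] for i in range(2)]

def first_kind(q, m, n, p):
    """[mn,p] = (1/2)(d g_mp/dx^n + d g_np/dx^m - d g_mn/dx^p)."""
    dg = [d_metric(q, k) for k in range(2)]   # dg[k][i][j] = d g_ij / d x^k
    return 0.5 * (dg[n][m][p] + dg[m][n][p] - dg[p][m][n])

def second_kind(q, m, n, p):
    """{mn,p} = g^{ps} [mn,s]."""
    g_up = inverse_metric(q)
    return sum(g_up[p][s] * first_kind(q, m, n, s) for s in range(2))

point = [2.0, 0.3]
print(second_kind(point, 1, 1, 0))   # symbol with m = n = theta, upper index r
print(second_kind(point, 0, 1, 1))   # symbol with m = r, n = theta, upper index theta
```

For this metric the only non-vanishing symbols of the second kind are known in closed form, namely -r and 1/r, so the two printed values can be compared with -2 and 0.5 at the chosen point.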


4.23. Tensors and the Elastic Body.—As an example of the use of the 
tensor in a physical problem let us consider a deformable body subjected 
to an infinitely small deformation or strain. Let $P_0$ be a point of the
medium in the unstrained state and let $P$ be its deformed position. If the
coordinates of $P_0$ and $P$ are $x_0^r$ and $x^r$ then the components $u_0^r$ of the
displacement vector will be

$$u_0^r = x^r - x_0^r = u_0^r(x_0^1, x_0^2, x_0^3) \tag{4-72}$$

Suppose $Q_0$ is a neighboring point as shown in Fig. 11 which is deformed
to the position $Q$. Now if the components of the vectors $P_0Q_0$ and $PQ$
are $v_0^r$ and $v^r$, the coordinates of $Q_0$ and $Q$ will be $x_0^r + v_0^r$ and $x^r + v^r$.
It follows¹¹ that

$$(x^r + v^r) - (x_0^r + v_0^r) = u_0^r + \left(\frac{\partial u^r}{\partial x^s}\right)_0 v_0^s \tag{4-73}$$

and, on using (72), that¹²

$$\delta v^r = v^r - v_0^r = \left(\frac{\partial u^r}{\partial x^s}\right)_0 v_0^s \tag{4-74}$$

The coefficients $(\partial u^r/\partial x^s)_0$ which relate the two vectors $\delta v^r$ and $v_0^s$, are the
components of a tensor. The terms $(\partial u^1/\partial x^1)$, $(\partial u^2/\partial x^2)$, $(\partial u^3/\partial x^3)$
are tension strains parallel to the axes $x^1$, $x^2$, $x^3$, respectively. The
remaining terms are shearing strains about these axes; for example,
$(\partial u^2/\partial x^1 + \partial u^1/\partial x^2)$ is the shearing strain about the axis perpendicular to
$x^1$ and $x^2$.

[Fig. 4-11]

If the nine components of the tensor are written out it will be seen that
it is not in general symmetric. However, it can be made so as shown in
sec. 4.21. Dropping the zero subscripts from (74) we write

$$\delta v^r = t^r_s v^s = e^r_s v^s + \omega^r_s v^s \tag{4-75}$$

where

$$t^r_s = (\partial u^r/\partial x^s); \qquad e^r_s = \tfrac{1}{2}(t^r_s + t^s_r); \qquad \omega^r_s = \tfrac{1}{2}(t^r_s - t^s_r) \tag{4-76}$$

The coefficients $e^r_s$ are now the components of a symmetrical tensor, which
is called a pure strain. It may be shown (see problem at end of this
section) that $\omega^r_s$ represents a rotation of the neighborhood of $P_0$ about $P_0$.
We could also add to (75) a translation by the amount $a^r$, so that

$$\delta v^r = a^r + e^r_s v^s + \omega^r_s v^s$$

represents the most general displacement of an elastic body, the total
motion being composed of: (1) a translation, (2) a pure strain, and (3) a
rotation.

11 The zero subscript on the derivative is meant to indicate that it is evaluated at the
point $P_0$.

12 This result only holds for rectangular coordinates. If (74) is to hold in generalized
coordinates, we must use the covariant derivative of $v^r$.

This brief discussion of tensors is entirely inadequate to indicate their
great value in mathematical physics. The subject has been most
frequently employed in the general theory of relativity.¹³ It may also be
applied with advantage in the study of dynamics, electricity, and
hydrodynamics.¹⁴ The material presented here is sufficient for the use which
will be made of tensors in this book.


Problem. Show that the tensor $\omega^r_s$ represents a rotation.
Hint: Write out the components of (76) and it will be seen that the resulting vector
is the vector product of two other vectors.


13 Eddington, A. S., “The Mathematical Theory of Relativity,” Second Edition,
Cambridge Press, 1930.

14 These subjects have been so treated by McConnell, A. J., “Applications of the
Absolute Differential Calculus,” Blackie and Sons, London, 1931, and more briefly by
Thomas, T. Y., “The Elementary Theory of Tensors,” McGraw-Hill Book Co., New
York, 1931. See also Kron, G., “Short Course in Tensor Analysis for Electrical
Engineers,” John Wiley and Sons, New York, 1942. Tensor methods have been used to
discuss the elastic properties of solids by Partington, J. R., “An Advanced Treatise on
Physical Chemistry,” Vol. 3, The Properties of Solids, Longmans, Green and Co.,
New York, 1952.

CHAPTER 5


COORDINATE SYSTEMS 
VECTORS AND CURVILINEAR COORDINATES 


5.1. Curvilinear Coordinates.—Although the methods of vector analysis 
prove convenient in the statement of physical laws, it is usually necessary 
to rewrite the vector equations in terms of suitable coordinates before the 
final solution of a specific problem can be obtained. It is the purpose of 
this chapter to show¹ how the components of vectors or vector operators
may be formulated in a system of curvilinear coordinates, the latter being
of so general a nature that it is an easy matter to transform from them to
any one of the several kinds of special coordinate systems which have
been found useful in physical problems.

In Cartesian coordinates, the position of a point $P(x,y,z)$ is determined
by the intersection of three mutually perpendicular planes, $x$ = const.,
$y$ = const., $z$ = const. When $x$, $y$ and $z$ are related to three new quantities
by the equations

$$\begin{aligned} x &= x(q_1,q_2,q_3) \\ y &= y(q_1,q_2,q_3) \\ z &= z(q_1,q_2,q_3) \end{aligned} \tag{5-1}$$

with inverses,

$$\begin{aligned} q_1 &= q_1(x,y,z) \\ q_2 &= q_2(x,y,z) \\ q_3 &= q_3(x,y,z) \end{aligned} \tag{5-2}$$

a given point may be described by specifying either $x$, $y$, $z$ or $q_1$, $q_2$, $q_3$, for
each equation of (2) represents a surface and the intersection of three such
surfaces locates the point. The surfaces $q_1$ = const., $q_2$ = const.,
$q_3$ = const. are called the coordinate surfaces; the space curves formed by
their intersection in pairs are called the coordinate lines. The coordinate
axes are determined by the tangents to the coordinate lines at the
intersection of three surfaces. They are not in general fixed directions in space,
as is true for simple Cartesian coordinates. The quantities $(q_1,q_2,q_3)$ are
the curvilinear coordinates of a point $P(x,y,z)$.


1 The relations which we derive here may be obtained in other ways; see sec. 5.16
and Hobson, E. W., “The Theory of Spherical and Ellipsoidal Harmonics,” Cambridge
Press, 1931.

From (1),

$$\begin{aligned} dx &= \frac{\partial x}{\partial q_1}\,dq_1 + \frac{\partial x}{\partial q_2}\,dq_2 + \frac{\partial x}{\partial q_3}\,dq_3 \\ dy &= \frac{\partial y}{\partial q_1}\,dq_1 + \frac{\partial y}{\partial q_2}\,dq_2 + \frac{\partial y}{\partial q_3}\,dq_3 \\ dz &= \frac{\partial z}{\partial q_1}\,dq_1 + \frac{\partial z}{\partial q_2}\,dq_2 + \frac{\partial z}{\partial q_3}\,dq_3 \end{aligned}$$

hence the square of the distance between two adjacent points,

$$ds^2 = dx^2 + dy^2 + dz^2 = Q_{11}^2\,dq_1^2 + Q_{22}^2\,dq_2^2 + Q_{33}^2\,dq_3^2 + 2Q_{12}\,dq_1 dq_2 + 2Q_{13}\,dq_1 dq_3 + 2Q_{23}\,dq_2 dq_3 \tag{5-3}$$

where,

$$\begin{aligned} Q_{ij} &= \frac{\partial x}{\partial q_i}\frac{\partial x}{\partial q_j} + \frac{\partial y}{\partial q_i}\frac{\partial y}{\partial q_j} + \frac{\partial z}{\partial q_i}\frac{\partial z}{\partial q_j} \\ Q_{ii}^2 &= \left(\frac{\partial x}{\partial q_i}\right)^2 + \left(\frac{\partial y}{\partial q_i}\right)^2 + \left(\frac{\partial z}{\partial q_i}\right)^2 \qquad (i,j = 1,2,3;\ i \neq j) \end{aligned} \tag{5-4}$$

For convenience we shall hereafter omit a repeated subscript, writing for
instance $Q_1$ instead of $Q_{11}$.

The distance between two points on a coordinate line is called the line
element. It is given by eq. (3) when variation is limited to only one of the
$q$'s,

$$ds_i = Q_i\,dq_i \qquad (i = 1,2,3) \tag{5-5}$$

The direction cosines between these line elements and $dx$, $dy$ or $dz$ may be
arranged as shown in Table 1 of sec. 4.1; for example, the cosine of the
angle between $ds_1$ and $dx$ is $(\partial x/\partial q_1)(dq_1/ds_1) = (\partial x/\partial q_1)/Q_1$, and the cosine
of the angle $\theta_{ij}$ between $ds_i$ and $ds_j$ is

$$\cos\theta_{ij} = Q_{ij}/Q_i Q_j$$

The most useful coordinate systems are orthogonal ones, that is, systems in
which surfaces always intersect at right angles. We shall limit ourselves
to such systems in secs. 5.2 to 5.15, returning to the more general
case of non-orthogonal systems in sec. 5.16. For the present, then,
$\cos\theta_{ij} = 0$, $Q_{ij} = 0$, and the cross product terms may be dropped from
(3). The three possible surface elements in orthogonal systems thus become


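$$dS_{ij} = ds_i\,ds_j = Q_i Q_j\,dq_i\,dq_j \qquad (i,j = 1,2,3;\ i \neq j) \tag{5-6}$$

and the volume element,

$$d\tau = ds_1\,ds_2\,ds_3 = Q_1 Q_2 Q_3\,dq_1\,dq_2\,dq_3 \tag{5-7}$$

5.2. Vector Relations in Curvilinear Coordinates.—If $\phi$ is a scalar point
function, $\nabla\phi$ must be the same in all coordinate systems, for $\nabla\phi$ is a vector
whose magnitude and direction give the maximum space rate of change of
$\phi$. A component of $\nabla\phi$ is its directional derivative (see sec. 4.9) in the
given direction, thus the component perpendicular to the surface $q_i$ =
constant and hence in the direction of $s_i$ is

$$\frac{\partial\phi}{\partial s_i} = \frac{1}{Q_i}\frac{\partial\phi}{\partial q_i}$$

in accordance with eq. (5). Since it is also possible to regard $\nabla$ as a vector
operator, it may be written in terms of unit vectors, $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ along the
curvilinear coordinate axes. Thus,

$$\nabla = \frac{\mathbf{u}_1}{Q_1}\frac{\partial}{\partial q_1} + \frac{\mathbf{u}_2}{Q_2}\frac{\partial}{\partial q_2} + \frac{\mathbf{u}_3}{Q_3}\frac{\partial}{\partial q_3} \tag{5-8}$$

so that

$$\nabla\phi = \frac{\mathbf{u}_1}{Q_1}\frac{\partial\phi}{\partial q_1} + \frac{\mathbf{u}_2}{Q_2}\frac{\partial\phi}{\partial q_2} + \frac{\mathbf{u}_3}{Q_3}\frac{\partial\phi}{\partial q_3} \tag{5-9}$$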
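Any vector may be written in terms of curvilinear components $V_1$,
$V_2$, $V_3$:

$$\mathbf{V} = \mathbf{u}_1 V_1 + \mathbf{u}_2 V_2 + \mathbf{u}_3 V_3 \tag{5-10}$$

but in order to find $\nabla\cdot\mathbf{V}$ (see sec. 4.10) in curvilinear coordinates, we must
know the relation between $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ and $x$, $y$, $z$. We proceed by evaluating
$\nabla\cdot\mathbf{u}_1$, starting with $\nabla\times\mathbf{u}_1$, since this is needed to obtain $\nabla\cdot\mathbf{u}_1$.

Remembering that $\mathbf{u}_1/Q_1$ is the product of a scalar and a vector, we
may write in view of (4-26)

$$\nabla\times\frac{\mathbf{u}_1}{Q_1} = \nabla\left(\frac{1}{Q_1}\right)\times\mathbf{u}_1 + \frac{1}{Q_1}\,\nabla\times\mathbf{u}_1 = -\mathbf{u}_1\times\nabla\left(\frac{1}{Q_1}\right) + \frac{1}{Q_1}\,(\nabla\times\mathbf{u}_1) \tag{5-11}$$

the change of sign coming from the change of order in the vector product.
From (9), we note that $\mathbf{u}_1/Q_1 = \nabla q_1$ and from (4-30) that

$$\nabla\times\nabla q_1 = 0$$

hence,

$$\mathbf{u}_1\times\nabla\left(\frac{1}{Q_1}\right) = \frac{1}{Q_1}\,(\nabla\times\mathbf{u}_1) \tag{5-12}$$

Now using (8) and performing the differentiation, we find

$$\nabla\left(\frac{1}{Q_1}\right) = -\frac{\mathbf{u}_1}{Q_1^3}\frac{\partial Q_1}{\partial q_1} - \frac{\mathbf{u}_2}{Q_1^2 Q_2}\frac{\partial Q_1}{\partial q_2} - \frac{\mathbf{u}_3}{Q_1^2 Q_3}\frac{\partial Q_1}{\partial q_3} \tag{5-13}$$

When we further recall that

$$\mathbf{u}_i\times\mathbf{u}_i = 0; \qquad \mathbf{u}_i\times\mathbf{u}_j = \mathbf{u}_k \quad (i, j, k \text{ in cyclic order}) \tag{5-14}$$

and substitute (13) in (12), we obtain

$$\nabla\times\mathbf{u}_1 = \frac{\mathbf{u}_2}{Q_1 Q_3}\frac{\partial Q_1}{\partial q_3} - \frac{\mathbf{u}_3}{Q_1 Q_2}\frac{\partial Q_1}{\partial q_2} \tag{5-15}$$

The scalar product of $\nabla$ and a unit vector may be written as

$$\nabla\cdot\mathbf{u}_1 = \nabla\cdot(\mathbf{u}_2\times\mathbf{u}_3) = \mathbf{u}_3\cdot(\nabla\times\mathbf{u}_2) - \mathbf{u}_2\cdot(\nabla\times\mathbf{u}_3) \tag{5-16}$$

by using (14) and (4-26). This becomes

$$\nabla\cdot\mathbf{u}_1 = \frac{1}{Q_1 Q_2 Q_3}\frac{\partial(Q_2 Q_3)}{\partial q_1} \tag{5-17}$$

when we expand the vector product by (15) and use the fact that

$$\mathbf{u}_i\cdot\mathbf{u}_i = 1; \qquad \mathbf{u}_i\cdot\mathbf{u}_j = 0 \quad (i \neq j) \tag{5-18}$$

In order to determine $\nabla\cdot\mathbf{V}$ in curvilinear coordinates, we see from (10),
that

$$\nabla\cdot\mathbf{V} = \nabla\cdot(\mathbf{u}_1 V_1) + \nabla\cdot(\mathbf{u}_2 V_2) + \nabla\cdot(\mathbf{u}_3 V_3)$$

a typical term becoming

$$\nabla\cdot(\mathbf{u}_1 V_1) = V_1\,\nabla\cdot\mathbf{u}_1 + \mathbf{u}_1\cdot\nabla V_1 \tag{5-19}$$

by (4-26). When $\nabla\cdot\mathbf{u}_1$ is written in the form of (17), $\nabla V_1$ in the form of
(9) and (18) used to eliminate the scalar products of the unit vectors, the
three terms of (19) reduce to

$$\nabla\cdot\mathbf{V} = \frac{1}{Q_1 Q_2 Q_3}\left[\frac{\partial}{\partial q_1}(Q_2 Q_3 V_1) + \frac{\partial}{\partial q_2}(Q_3 Q_1 V_2) + \frac{\partial}{\partial q_3}(Q_1 Q_2 V_3)\right] \tag{5-20}$$

If $\mathbf{V} = \nabla\phi$,

$$\nabla\cdot\nabla\phi = \nabla^2\phi = \frac{1}{Q_1 Q_2 Q_3}\left[\frac{\partial}{\partial q_1}\left(\frac{Q_2 Q_3}{Q_1}\frac{\partial\phi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{Q_3 Q_1}{Q_2}\frac{\partial\phi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{Q_1 Q_2}{Q_3}\frac{\partial\phi}{\partial q_3}\right)\right] \tag{5-21}$$

since the components of $\nabla\phi$ are $V_i = (\partial\phi/\partial q_i)/Q_i$.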
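
Formula (5-21) lends itself to a direct numerical check. The sketch below is an illustration, not a procedure from the text: it assumes the spherical polar values Q = (1, r, r sin theta) as a test case, replaces the derivatives by central differences with a hypothetical step `h`, and applies the result to two functions whose Laplacians are known, x^2 (Laplacian 2) and 1/r (Laplacian 0 away from the origin).

```python
import math

def scale_factors(q):
    """Q_1, Q_2, Q_3 for spherical polar coordinates q = (r, theta, phi),
    used here only as a convenient test case."""
    r, theta, _ = q
    return (1.0, r, r * math.sin(theta))

def laplacian(f, q, h=1e-3):
    """Central-difference evaluation of (5-21) at the point q."""
    def weighted_derivative(i, p):
        # (Q_1 Q_2 Q_3 / Q_i^2) * df/dq_i evaluated at the point p.
        Q = scale_factors(p)
        up, dn = list(p), list(p)
        up[i] += 0.5 * h
        dn[i] -= 0.5 * h
        df_dq = (f(up) - f(dn)) / h
        return Q[0] * Q[1] * Q[2] / Q[i] ** 2 * df_dq

    total = 0.0
    for i in range(3):
        up, dn = list(q), list(q)
        up[i] += 0.5 * h
        dn[i] -= 0.5 * h
        total += (weighted_derivative(i, up) - weighted_derivative(i, dn)) / h
    Q = scale_factors(q)
    return total / (Q[0] * Q[1] * Q[2])

def x_squared(q):
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi)) ** 2

def inverse_r(q):
    return 1.0 / q[0]

point = [1.5, 0.8, 0.4]
value_x_squared = laplacian(x_squared, point)
value_inverse_r = laplacian(inverse_r, point)
print(value_x_squared, value_inverse_r)
```

Only `scale_factors` refers to a particular coordinate system; replacing it with the Q values of another orthogonal system would reuse the same `laplacian` routine unchanged.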
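The curl of a vector in terms of the unit curvilinear vectors becomes

$$\nabla\times\mathbf{V} = \nabla\times(\mathbf{u}_1 V_1) + \nabla\times(\mathbf{u}_2 V_2) + \nabla\times(\mathbf{u}_3 V_3)$$

which may be expanded by using (11), to give terms like

$$\nabla\times(\mathbf{u}_1 V_1) = V_1(\nabla\times\mathbf{u}_1) - \mathbf{u}_1\times(\nabla V_1)$$

When three similar equations are added together, the result in
determinantal form is

$$\nabla\times\mathbf{V} = \frac{1}{Q_1 Q_2 Q_3}\begin{vmatrix} Q_1\mathbf{u}_1 & Q_2\mathbf{u}_2 & Q_3\mathbf{u}_3 \\ \dfrac{\partial}{\partial q_1} & \dfrac{\partial}{\partial q_2} & \dfrac{\partial}{\partial q_3} \\ V_1 Q_1 & V_2 Q_2 & V_3 Q_3 \end{vmatrix} \tag{5-22}$$

In order to compute $\nabla^2\mathbf{V}$ in curvilinear coordinates, use is made of the
relation (4-32),

$$\nabla^2\mathbf{V} = \nabla(\nabla\cdot\mathbf{V}) - \nabla\times\nabla\times\mathbf{V}$$

which may be reduced to the desired form by means of (8), (20) and (22).
The component of the resulting expression along the $\mathbf{u}_1$ direction is given by

$$\frac{1}{Q_1}\frac{\partial}{\partial q_1}(\nabla\cdot\mathbf{V}) - \frac{1}{Q_2 Q_3}\frac{\partial}{\partial q_2}\left[Q_3(\nabla\times\mathbf{V})_3\right] + \frac{1}{Q_2 Q_3}\frac{\partial}{\partial q_3}\left[Q_2(\nabla\times\mathbf{V})_2\right] \tag{5-23}$$

where $(\nabla\times\mathbf{V})_3$ and $(\nabla\times\mathbf{V})_2$ are the components of $\nabla\times\mathbf{V}$ along $\mathbf{u}_3$
and $\mathbf{u}_2$. The two other components of $\nabla^2\mathbf{V}$ are obtained from (23) by
cyclic permutation of the subscripts 1, 2, 3.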

The task of computing any of these vector quantities in special coordinate
systems is seen to involve calculation of the $Q_i$, which may be done in
a straightforward way from (4) provided relations like (1) or (2) are known.
In the remainder of this chapter we discuss those special systems which
appear to be most useful. We include all those which may be used to solve
the three-dimensional Schrödinger wave equation of quantum mechanics.
It has been shown² that the method of separation of variables (cf.
Chapter 7) is applicable to this equation only if the potential energy is of the
form

$$V = \sum_i f_i(q_i)/Q_i^2 \tag{5-24}$$

and the coordinates have certain special properties. There are eleven
such systems; these are the ones described in secs. 5.3-5.9, 5.11-5.13 and
the confocal ellipsoidal system of sec. 5.6 expressed in terms of elliptic
integrals. We indicate other examples of the use of some of the systems as
we proceed. In each case, we describe the geometry, give the relations
between the new coordinates and $x$, $y$, $z$ and list the resulting $Q_i$ obtained
from (4). Calculation of $\nabla\phi$, $\nabla^2\phi$, $\nabla\times\mathbf{V}$, etc., may be performed as an
exercise by the student³ (see problems in later sections).

2 Robertson, H. P., Math. Ann. 98, 749 (1928); Eisenhart, L. P., Phys. Rev. 45, 427
(1934); 74, 87 (1948). These coordinate systems have been discussed in considerable
detail by Morse, P. M., and Feshbach, H., “Methods of Theoretical Physics,” 2 vols.,
McGraw-Hill Book Co., New York, 1953.


SPECIAL ORTHOGONAL COORDINATE SYSTEMS 
5.3. Cartesian Coordinates.—These form a trivial case of curvilinear 
coordinate systems. 
G@=Q=Q=1 (5-25) 
³ Some of these quantities for certain of the systems may be found in Pauling and
Wilson, “Introduction to Quantum Mechanics,” Appendix IV, McGraw-Hill Book Co.,
New York, 1935; see also Adams, E. P., “Smithsonian Mathematical Formulae,”
Washington, 1922, and Magnus, W., and Oberhettinger, F., “Formulas and Theorems
for the Special Functions of Mathematical Physics,” translated by John Wermer,
Chelsea Publishing Co., New York, 1949.

5.4. Spherical Polar Coordinates.—The coordinate surfaces are families
of: (1) concentric spheres about the origin ($r$ = const.), (2) right circular
cones with apex at the origin and axis along $Z$ ($\theta$ = const.), (3) half-planes
from the Z-axis ($\varphi$ = const.). A point $P(x,y,z)$ is located by specifying
the radius $r$ of the sphere on which it lies, its colatitude $\theta$, and its longitude
or azimuth $\varphi$ on the sphere. From Fig. 1, it follows that

$$x = r \sin\theta \cos\varphi, \qquad y = r \sin\theta \sin\varphi, \qquad z = r \cos\theta \tag{5-26}$$

Fig. 5-1

Remembering that $ds_i = Q_i\,dq_i$, values of the $Q_i$ may also be determined by
inspection from the figure, thus

$$Q_r = 1; \qquad Q_\theta = r; \qquad Q_\varphi = r \sin\theta \tag{5-27}$$
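The values in (5-27) can also be checked numerically. The sketch below is only an illustration, not part of the original treatment: it assumes the definition $Q_i^2 = \sum_k (\partial x_k/\partial q_i)^2$ of eq. (4) and replaces the partial derivatives by central differences. The function names `scale_factors` and `spherical`, the step `h` and the sample point are all hypothetical choices.

```python
import math

def scale_factors(transform, q, h=1e-6):
    """Return (Q_1, Q_2, Q_3) at q, using Q_i^2 = sum_k (dx_k/dq_i)^2."""
    factors = []
    for i in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        x_plus = transform(*q_plus)
        x_minus = transform(*q_minus)
        # Central difference of each Cartesian component with respect to q_i.
        derivs = [(xp - xm) / (2.0 * h) for xp, xm in zip(x_plus, x_minus)]
        factors.append(math.sqrt(sum(d * d for d in derivs)))
    return tuple(factors)

def spherical(r, theta, phi):
    """Cartesian coordinates from eq. (5-26)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

# Arbitrary sample point (an assumption made only for this illustration).
r, theta, phi = 2.0, 0.7, 1.2
Q_r, Q_theta, Q_phi = scale_factors(spherical, (r, theta, phi))
print(Q_r, Q_theta, Q_phi)  # to be compared with 1, r and r*sin(theta)
```

The same helper can be applied to any of the transformations listed in this chapter by swapping in a different `transform` function.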

5.5. Cylindrical Coordinates.—The coordinate surfaces are: (1) right
circular cylinders which form families of concentric circles about the
origin in the XY-plane ($\rho$ = const.); (2) half-planes from the Z-axis
($\varphi$ = const.); (3) planes parallel to the XY-plane ($z$ = const.). A point
$P(x,y,z)$ is located by giving the distance $\rho$ in the XY-plane from the origin
to the cylinder on which the point lies, the angle $\varphi$ in the XY-plane and the
distance on the Z-axis from this plane to the point. From Fig. 2

$$x = \rho \cos\varphi, \qquad y = \rho \sin\varphi, \qquad z = z \tag{5-28}$$

$$Q_\rho = Q_z = 1; \qquad Q_\varphi = \rho \tag{5-29}$$

Fig. 5-2


5.6. Confocal Ellipsoidal Coordinates.—In this system, the coordinate
surfaces are families of (1) ellipsoids ($\lambda$ = const.); (2) hyperboloids of one
sheet ($\mu$ = const.); (3) hyperboloids of two sheets ($\nu$ = const.) given by
the equations

$$\begin{aligned}
\frac{x^2}{a^2-\lambda} + \frac{y^2}{b^2-\lambda} + \frac{z^2}{c^2-\lambda} &= 1 \\
\frac{x^2}{a^2-\mu} + \frac{y^2}{b^2-\mu} + \frac{z^2}{c^2-\mu} &= 1 \\
\frac{x^2}{a^2-\nu} + \frac{y^2}{b^2-\nu} + \frac{z^2}{c^2-\nu} &= 1
\end{aligned} \tag{5-30}$$

where $\lambda$, $\mu$, $\nu$ are parameters called ellipsoidal coordinates; $a$, $b$, $c$ are
constants; $a^2 > \nu > b^2 > \mu > c^2 > \lambda > -\infty$. It is shown in books on solid
analytical geometry that intersections of these three surfaces are orthogonal
and that all of them have common foci. Moreover, through any fixed
point $P(x,y,z)$ there passes one and only one surface of each type.

The relation between the new and the old coordinates may be found by
solving (30) directly. It may be done more easily as follows. Consider
the cubic equation in a parameter $q$

$$\frac{x^2}{a^2-q} + \frac{y^2}{b^2-q} + \frac{z^2}{c^2-q} = 1 \tag{5-31}$$

with three real roots, $\lambda$, $\mu$, $\nu$ satisfying the inequalities just stated. As $q$
varies between $a^2$ and $-\infty$, (31) describes the complete system of confocal
surfaces given in (30). On clearing (31) of fractions and equating it to its
identity, we have

$$\begin{aligned}
&x^2(b^2-q)(c^2-q) + y^2(a^2-q)(c^2-q) + z^2(a^2-q)(b^2-q) \\
&\qquad - (a^2-q)(b^2-q)(c^2-q) = (q-\lambda)(q-\mu)(q-\nu) = 0
\end{aligned} \tag{5-32}$$

and this must hold for every value of $q$. Upon setting $q = a^2$, $b^2$, $c^2$ in
turn, we obtain

$$\begin{aligned}
x^2 &= \frac{(a^2-\lambda)(a^2-\mu)(a^2-\nu)}{(b^2-a^2)(c^2-a^2)} \\
y^2 &= \frac{(b^2-\lambda)(b^2-\mu)(b^2-\nu)}{(a^2-b^2)(c^2-b^2)} \\
z^2 &= \frac{(c^2-\lambda)(c^2-\mu)(c^2-\nu)}{(a^2-c^2)(b^2-c^2)}
\end{aligned} \tag{5-33}$$


Taking the logarithm of (33), differentiating partially with respect to $\lambda$
and using (4), we have

$$Q_\lambda^2 = \frac{1}{4}\left[\frac{(a^2-\mu)(a^2-\nu)}{(a^2-\lambda)(b^2-a^2)(c^2-a^2)} + \frac{(b^2-\mu)(b^2-\nu)}{(b^2-\lambda)(a^2-b^2)(c^2-b^2)} + \frac{(c^2-\mu)(c^2-\nu)}{(c^2-\lambda)(a^2-c^2)(b^2-c^2)}\right] \tag{5-34}$$

Values for $Q_\mu$ and $Q_\nu$ may be obtained in a similar way or from (34) by
cyclic interchange of $(\lambda\mu\nu)$. Simplification of the resulting expressions
yields

$$\begin{aligned}
Q_\lambda^2 &= \frac{1}{4}\,\frac{(\mu-\lambda)(\nu-\lambda)}{(a^2-\lambda)(b^2-\lambda)(c^2-\lambda)} \\
Q_\mu^2 &= \frac{1}{4}\,\frac{(\nu-\mu)(\lambda-\mu)}{(a^2-\mu)(b^2-\mu)(c^2-\mu)} \\
Q_\nu^2 &= \frac{1}{4}\,\frac{(\lambda-\nu)(\mu-\nu)}{(a^2-\nu)(b^2-\nu)(c^2-\nu)}
\end{aligned} \tag{5-35}$$

It is somewhat laborious to transform (34) directly into the first equation of
(35) but their equivalence may be verified by writing the latter in terms of
partial fractions.
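As a plausibility check on (5-33) and (5-35), the following sketch evaluates both numerically. It is an illustration under stated assumptions only: the constants `a2`, `b2`, `c2` (standing for $a^2$, $b^2$, $c^2$) and the sample coordinates are arbitrary values chosen to satisfy the inequalities of this section, and the derivative in eq. (4) is replaced by a central difference.

```python
import math

# Illustrative constants and coordinates (assumptions, not from the text):
a2, b2, c2 = 9.0, 4.0, 1.0      # a^2 > b^2 > c^2
lam, mu, nu = 0.5, 2.0, 6.0     # a^2 > nu > b^2 > mu > c^2 > lam

def cartesian_squares(lam, mu, nu):
    """x^2, y^2, z^2 as given by eq. (5-33)."""
    x2 = (a2 - lam) * (a2 - mu) * (a2 - nu) / ((b2 - a2) * (c2 - a2))
    y2 = (b2 - lam) * (b2 - mu) * (b2 - nu) / ((a2 - b2) * (c2 - b2))
    z2 = (c2 - lam) * (c2 - mu) * (c2 - nu) / ((a2 - c2) * (b2 - c2))
    return x2, y2, z2

def quadric(q, x2, y2, z2):
    """Left-hand side of eq. (5-31) for the parameter value q."""
    return x2 / (a2 - q) + y2 / (b2 - q) + z2 / (c2 - q)

x2, y2, z2 = cartesian_squares(lam, mu, nu)
# Each of lam, mu, nu should be a root of (5-31), so each residual should vanish.
residuals = [quadric(q, x2, y2, z2) - 1.0 for q in (lam, mu, nu)]

def q_lambda_squared_numeric(h=1e-6):
    """Q_lambda^2 from eq. (4), using central differences of x, y, z (positive octant)."""
    plus = [math.sqrt(s) for s in cartesian_squares(lam + h, mu, nu)]
    minus = [math.sqrt(s) for s in cartesian_squares(lam - h, mu, nu)]
    return sum(((p - m) / (2.0 * h)) ** 2 for p, m in zip(plus, minus))

# First line of eq. (5-35).
q_lambda_squared_closed = (0.25 * (mu - lam) * (nu - lam)
                           / ((a2 - lam) * (b2 - lam) * (c2 - lam)))

print(residuals)
print(q_lambda_squared_numeric(), q_lambda_squared_closed)
```

Taking the positive square roots corresponds to working in the octant where $x$, $y$, $z$ are all positive; the squared derivatives are unaffected by that choice.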

Because of the fact that $x$, $y$ and $z$ appear as squares in (33), a given
point $P(x,y,z)$ is not uniquely determined by $(\lambda,\mu,\nu)$; in fact, eight points
symmetrically located relative to the (XYZ)-axes correspond to the set
$(\lambda\mu\nu)$. This ambiguity may be resolved by adopting some convention
concerning the signs of $(\lambda,\mu,\nu)$, or in more elegant fashion by the
introduction of elliptic functions. The latter procedure may be accomplished
either by means of elliptic integrals, Jacobian elliptic functions or
Weierstrass $\wp$-functions.⁴

The confocal ellipsoidal coordinate system has proved useful in problems
of mechanics, potential theory, electrodynamics and hydrodynamics.⁵

⁴ Full details concerning these functions may be found in Whittaker, E. T. and
Watson, G. N., “A Course of Modern Analysis,” Fourth Edition, Cambridge Press,
1927.

⁵ Some references to these applications are: MacMillan, W. D., “Statics and
Dynamics of a Particle,” 1927, “The Theory of the Potential,” 1930, McGraw-Hill Book
Co.; Kellogg, O. D., “Foundations of Potential Theory,” J. Springer, Berlin, 1929;
Mason, Max and Weaver, Warren, “The Electromagnetic Field,” University of Chicago
Press, 1929; Milne-Thomson, L. M., “Theoretical Hydrodynamics,” Macmillan and
Co., London, 1938.

5.7. Prolate Spheroidal Coordinates.—Degenerate cases of the preceding
system may arise if two or three of the axes in (31) become equal.
Additional surfaces are then needed since the resulting equation in $q$ is
either quadratic or linear. Instead of following a method similar to that
used for ellipsoidal coordinates, it is simpler to proceed by considering the
equations of an ellipse and a hyperbola,

$$\frac{z^2}{a^2} + \frac{x^2}{a^2(1-e_1^2)} = 1; \qquad \frac{z^2}{a^2} - \frac{x^2}{a^2(e_2^2-1)} = 1 \tag{5-36}$$

where $a$ is the semi-major axis and $e_1 < 1$ is the eccentricity of the ellipse,
$e_2 > 1$, the eccentricity of the hyperbola. If we now substitute $a \cosh u$
for $a$ and $\operatorname{sech} u$ for $e_1$ in the ellipse, $a \cos v$ for $a$ and $\sec v$ for $e_2$ in the
hyperbola and finally $x^2 + y^2 = r^2$ for $x^2$, we obtain

$$\frac{z^2}{a^2\cosh^2 u} + \frac{r^2}{a^2\sinh^2 u} = 1; \qquad \frac{z^2}{a^2\cos^2 v} - \frac{r^2}{a^2\sin^2 v} = 1 \tag{5-37}$$

with $0 \le u < \infty$, $0 \le v \le \pi$. These equations represent the confocal
families of: (1) prolate spheroids⁶ ($u$ = const.) and (2) hyperboloids (of
two sheets) of revolution ($v$ = const.) obtained by rotating the ellipses
and hyperbolas of (36) around the Z-axis. The intersection of these
surfaces, as shown in Fig. 3, will be a circle of radius $r$; hence if $0 \le \varphi \le 2\pi$,
the addition of (3), a family of planes through the Z-axis ($\varphi$ = const.), to
the spheroids and hyperboloids gives us three suitable coordinate surfaces
$(u,v,\varphi)$. We may then solve (37) for $z$ and $r$ and simplify the resulting
expressions by means of the relations between trigonometric functions.
Finally, we set $x = r \cos\varphi$, $y = r \sin\varphi$, obtaining

$$x = a \sinh u \sin v \cos\varphi, \qquad y = a \sinh u \sin v \sin\varphi, \qquad z = a \cosh u \cos v \tag{5-38}$$

and from (4),

$$Q_u^2 = Q_v^2 = a^2(\sinh^2 u + \sin^2 v); \qquad Q_\varphi^2 = a^2 \sinh^2 u \sin^2 v \tag{5-39}$$
⁶ Also called ovary ellipsoids.

An important property of prolate spheroidal coordinates makes them
useful in certain quantum mechanical problems. It is well known from
analytical geometry that the sum of the focal radii of an ellipse is a constant,
equal to the major axis. Similarly the difference between the focal radii
in a hyperbola equals the transverse axis. If $r_A$ and $r_B$ are the distances
from the two foci to a point of intersection of the ellipsoids and
hyperboloids, we find that

$$r_A + r_B = 2a \cosh u; \qquad r_A - r_B = 2a \cos v$$

where we have replaced $a$ by $a \cosh u$ and by $a \cos v$ as before. This
procedure thus locates a point relative to any two-center problem such as the
diatomic molecule (see sec. 11.21). It is often convenient to introduce the
coordinates $\xi$ and $\eta$ in place of $\cosh u$ and $\cos v$, respectively, so that

$$\xi = \frac{r_A + r_B}{2a}; \qquad \eta = \frac{r_A - r_B}{2a} \tag{5-40}$$

In terms of these variables, the volume element may be seen to take the
form

$$d\tau = a^3(\xi^2 - \eta^2)\,d\xi\,d\eta\,d\varphi$$
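The focal-radius relations and (5-40) can be confirmed for a sample point. The sketch below is a hedged illustration: it assumes that the foci lie on the Z-axis at $z = \mp a$ and labels the focus at $z = -a$ as A, which is the labelling that makes $r_A - r_B$ equal to $+2a\cos v$; the text itself does not state which focus is which. The numerical values are arbitrary.

```python
import math

def prolate_to_cartesian(a, u, v, phi):
    """Cartesian coordinates from eq. (5-38)."""
    return (a * math.sinh(u) * math.sin(v) * math.cos(phi),
            a * math.sinh(u) * math.sin(v) * math.sin(phi),
            a * math.cosh(u) * math.cos(v))

# Arbitrary sample values (assumptions for illustration only).
a, u, v, phi = 1.5, 0.8, 1.1, 0.4
x, y, z = prolate_to_cartesian(a, u, v, phi)

# Assumed labelling: focus A at (0, 0, -a), focus B at (0, 0, +a).
r_A = math.sqrt(x * x + y * y + (z + a) ** 2)
r_B = math.sqrt(x * x + y * y + (z - a) ** 2)

xi = (r_A + r_B) / (2.0 * a)    # eq. (5-40); should reproduce cosh u
eta = (r_A - r_B) / (2.0 * a)   # eq. (5-40); should reproduce cos v

print(xi, math.cosh(u))
print(eta, math.cos(v))
```

Interchanging the labels A and B merely reverses the sign of $\eta$.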


5.8. Oblate Spheroidal Coordinates.—When ellipses are rotated about
their minor axis, the resulting surfaces are oblate spheroids.⁷ If we rewrite
(37) so that the axis of revolution is again the Z-axis, but now⁸ the minor
axis of the ellipse, we have

$$\frac{r^2}{a^2\cosh^2 u} + \frac{z^2}{a^2\sinh^2 u} = 1; \qquad \frac{r^2}{a^2\sin^2 v} - \frac{z^2}{a^2\cos^2 v} = 1 \tag{5-41}$$

with $0 \le u < \infty$, $0 \le v \le \pi$, $x = r \cos\varphi$, $y = r \sin\varphi$, $0 \le \varphi \le 2\pi$. The
coordinate surfaces are thus: (1) oblate spheroids ($u$ = const.); (2) hyperboloids
(of one sheet) of revolution ($v$ = const.); (3) planes through the
Z-axis ($\varphi$ = const.). From (41), we find

$$x = a \cosh u \sin v \cos\varphi, \qquad y = a \cosh u \sin v \sin\varphi, \qquad z = a \sinh u \cos v \tag{5-42}$$

and from (4),

$$Q_u^2 = Q_v^2 = a^2(\sinh^2 u + \cos^2 v); \qquad Q_\varphi^2 = a^2 \cosh^2 u \sin^2 v \tag{5-43}$$

The geometry of the system may be inferred from Fig. 3 by suitable
interchange of the X-, Y- and Z-axes.

⁷ Also called planetary ellipsoids. The figures of the earth and of the planet Jupiter
are approximately of this form.

⁸ At the risk of some confusion, we have interchanged axes in this system and in
some of the following ones so that the Z-axis is always the axis of revolution.

5.9. Elliptic Cylindrical Coordinates.—If (37) is again rewritten with
$x^2$ in place of $z^2$ and $y^2$ in place of $r^2$, the loci of these equations are cylindrical
surfaces, whose elements are parallel to the Z-axis and perpendicular
to the XY-plane. Their intersections with this plane are ellipses and
hyperbolas. The coordinate surfaces are: (1) elliptic cylinders ($u$ =
const.); (2) hyperbolic cylinders ($v$ = const.); (3) planes parallel to the
XY-plane ($z$ = const.). Proceeding as before,

$$x = a \cosh u \cos v, \qquad y = a \sinh u \sin v, \qquad z = z \tag{5-44}$$

$$Q_u^2 = Q_v^2 = a^2(\sinh^2 u + \sin^2 v); \qquad Q_z^2 = 1 \tag{5-45}$$

The intersection of these cylinders with the XY-plane may also be inferred
from Fig. 3.

Fig. 5-3


5.10. Conical Coordinates.—A further degenerate case of the system
of sec. 5.6 arises when the orthogonal sets of surfaces are: (1) spheres with
centers at the origin and radius $u$ ($u$ = const.); (2) cones with apexes at
the origin and axes along the Z-axis ($v$ = const.); (3) cones with apexes
at the origin and axes along the X-axis ($w$ = const.), their equations being

$$\begin{aligned}
x^2 + y^2 + z^2 &= u^2 \\
\frac{x^2}{v^2} + \frac{y^2}{v^2-b^2} + \frac{z^2}{v^2-c^2} &= 0 \\
\frac{x^2}{w^2} + \frac{y^2}{w^2-b^2} + \frac{z^2}{w^2-c^2} &= 0
\end{aligned} \tag{5-46}$$

with $c^2 > v^2 > b^2 > w^2$. The projections of the surfaces on the XY-plane are
families of circles, ellipses and triangles. From (46), we find

$$\begin{aligned}
x^2 &= \frac{u^2 v^2 w^2}{b^2 c^2} \\
y^2 &= \frac{u^2 (v^2-b^2)(w^2-b^2)}{b^2 (b^2-c^2)} \\
z^2 &= \frac{u^2 (v^2-c^2)(w^2-c^2)}{c^2 (c^2-b^2)}
\end{aligned} \tag{5-47}$$

and from (4)

$$Q_u = 1; \qquad Q_v^2 = \frac{u^2(v^2-w^2)}{(v^2-b^2)(c^2-v^2)}; \qquad Q_w^2 = \frac{u^2(v^2-w^2)}{(w^2-b^2)(w^2-c^2)} \tag{5-48}$$


5.11. Confocal Paraboloidal Coordinates.—A system similar to that of
sec. 5.6 has coordinate surfaces consisting of confocal families of: (1)
elliptic paraboloids extending in the direction of the negative Z-axis
($\lambda$ = const.); (2) hyperbolic paraboloids ($\mu$ = const.); (3) elliptic paraboloids
extending along the positive Z-axis ($\nu$ = const.). The equations
for the surfaces are

$$\begin{aligned}
\frac{x^2}{a^2-\lambda} + \frac{y^2}{b^2-\lambda} + 2z + \lambda &= 0 \\
\frac{x^2}{a^2-\mu} - \frac{y^2}{\mu-b^2} + 2z + \mu &= 0 \\
-\frac{x^2}{\nu-a^2} - \frac{y^2}{\nu-b^2} + 2z + \nu &= 0
\end{aligned} \tag{5-49}$$

where $-\infty < \lambda < b^2 < \mu < a^2 < \nu < +\infty$. Proceeding as in the confocal
ellipsoidal system, we may write the cubic equation in $q$,

$$\frac{x^2}{a^2-q} + \frac{y^2}{b^2-q} + 2z + q = 0 \tag{5-50}$$

with three real roots, $\lambda$, $\mu$, $\nu$. As $q$ varies between $-\infty$ and $+\infty$, the
complete system of confocal surfaces (49) will be described. On clearing (50)
of fractions and equating it to its identity, we have

$$x^2(b^2-q) + y^2(a^2-q) + (2z+q)(a^2-q)(b^2-q) = (q-\lambda)(q-\mu)(q-\nu) = 0 \tag{5-51}$$

Expressions for $x^2$ and $y^2$ may be obtained from (51) by setting $q = a^2$ and
$b^2$ in turn; the result for $z$ is found by equating the coefficients of $q^2$ on
both sides of (51). We thus have

$$\begin{aligned}
x^2 &= \frac{(a^2-\lambda)(a^2-\mu)(a^2-\nu)}{b^2-a^2} \\
y^2 &= \frac{(b^2-\lambda)(b^2-\mu)(b^2-\nu)}{a^2-b^2} \\
z &= \tfrac{1}{2}(a^2 + b^2 - \lambda - \mu - \nu)
\end{aligned} \tag{5-52}$$

and

$$\begin{aligned}
Q_\lambda^2 &= \frac{1}{4}\,\frac{(\mu-\lambda)(\nu-\lambda)}{(a^2-\lambda)(b^2-\lambda)} \\
Q_\mu^2 &= \frac{1}{4}\,\frac{(\nu-\mu)(\lambda-\mu)}{(a^2-\mu)(b^2-\mu)} \\
Q_\nu^2 &= \frac{1}{4}\,\frac{(\lambda-\nu)(\mu-\nu)}{(a^2-\nu)(b^2-\nu)}
\end{aligned} \tag{5-53}$$

Because of the appearance of $x$ and $y$ as squares in (52), a set $(\lambda,\mu,\nu)$
corresponds to four points $P(x,y,z)$ symmetrically located with respect to
the XZ- and YZ-planes. As in the confocal ellipsoidal system (sec. 5.6)
the ambiguity may be removed by the use⁹ of elliptic integrals.

⁹ See, for example, Maxwell, J. C., “A Treatise on Electricity and Magnetism,”
Vol. I, Third Edition, Oxford Press, 1904, p. 240. Application of this system is also
described there.

5.12. Parabolic Coordinates.—If two roots of (50) become equal, the
preceding method fails since there are now only two surfaces. In this case,
consider the families of parabolas

$$x^2 = 2\xi^2(z + \xi^2/2); \qquad x^2 = -2\eta^2(z - \eta^2/2) \tag{5-54}$$

The vertices of all parabolas lie on the Z-axis at distances $-\xi^2/2$ and
$\eta^2/2$, respectively, and all of them have a common focus at the origin of the
Cartesian coordinate system. If we now rotate these parabolas about the
Z-axis, the resulting intersections are circles and the paraboloids of revolution
are still given by (54) if we replace $x^2$ by $r^2 = x^2 + y^2$, $x = r \cos\varphi$,
$y = r \sin\varphi$. We thus obtain

$$x = \xi\eta \cos\varphi, \qquad y = \xi\eta \sin\varphi, \qquad z = (\eta^2 - \xi^2)/2 \tag{5-55}$$

and from (4),

$$Q_\xi^2 = Q_\eta^2 = \xi^2 + \eta^2; \qquad Q_\varphi^2 = \xi^2\eta^2 \tag{5-56}$$

The coordinate surfaces are: (1) paraboloids of revolution extending in the
direction of the positive Z-axis ($\xi$ = const.); (2) paraboloids of revolution
extending toward the negative Z-direction ($\eta$ = const.); (3) planes through
the Z-axis ($\varphi$ = const.). Intersections of these surfaces with the XZ-
and XY-planes are shown in Fig. 4. Parabolic coordinates have been
used in the treatment of the Stark effect.¹⁰

Fig. 5-4
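The statements made about (5-54) and (5-55) lend themselves to a direct numerical check. The sketch below is illustrative only; the sample values are arbitrary assumptions. It verifies that a point generated by (5-55) lies on both paraboloids of (5-54) after $x^2$ is replaced by $r^2$, and that its distance from the origin equals $(\xi^2 + \eta^2)/2$, which is what the common focus at the origin implies, since that distance is then the same as the distance to the directrix plane of either paraboloid.

```python
import math

def parabolic_to_cartesian(xi, eta, phi):
    """Cartesian coordinates from eq. (5-55)."""
    return (xi * eta * math.cos(phi),
            xi * eta * math.sin(phi),
            (eta ** 2 - xi ** 2) / 2.0)

# Arbitrary sample values (assumptions for illustration only).
xi, eta, phi = 0.9, 1.7, 2.3
x, y, z = parabolic_to_cartesian(xi, eta, phi)
r_squared = x * x + y * y

# Residuals of the two equations (5-54) with x^2 replaced by r^2.
residual_xi = r_squared - 2.0 * xi ** 2 * (z + xi ** 2 / 2.0)
residual_eta = r_squared + 2.0 * eta ** 2 * (z - eta ** 2 / 2.0)

# Distance from the origin, the common focus of both families.
focal_distance = math.sqrt(r_squared + z * z)

print(residual_xi, residual_eta)
print(focal_distance, (xi ** 2 + eta ** 2) / 2.0)
```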


¹⁰ Schrödinger, E., Ann. Physik 80, 457 (1926); Epstein, P. S., Phys. Rev. 28, 695
(1926).

5.13. Parabolic Cylindrical Coordinates.—A system similar to elliptical
cylindrical coordinates is obtained by adding planes to the parabolic
cylinders represented by (54). If we replace $z$ by $y$ in those equations,
we have

$$x = \xi\eta, \qquad y = (\eta^2 - \xi^2)/2, \qquad z = z \tag{5-57}$$

$$Q_\xi^2 = Q_\eta^2 = \xi^2 + \eta^2; \qquad Q_z = 1 \tag{5-58}$$

The coordinate surfaces are: (1) parabolic cylinders ($\xi$ = const.); (2)
parabolic cylinders ($\eta$ = const.); (3) planes ($z$ = const.). The intersection
of these surfaces with the XY-plane is like the system of confocal
parabolas shown in Fig. 4.

5.14. Bipolar Coordinates.—Before considering this system, we list a
few relations which are needed in the subsequent discussion. In terms of
exponentials, we may write

$$\sin x = \frac{1}{2i}(e^{ix} - e^{-ix}); \qquad \cos x = \tfrac{1}{2}(e^{ix} + e^{-ix}); \qquad \tan x = \frac{\sin x}{\cos x} = \frac{i(1 - e^{2ix})}{1 + e^{2ix}} \tag{5-59}$$

Replacing $x$ by $ix$, we have the corresponding hyperbolic functions

$$\begin{aligned}
\sin ix &= \frac{1}{2i}(e^{-x} - e^{x}) = i \sinh x \\
\cos ix &= \tfrac{1}{2}(e^{-x} + e^{x}) = \cosh x \\
\tan ix &= \frac{i(e^{2x} - 1)}{e^{2x} + 1} = i \tanh x
\end{aligned} \tag{5-60}$$

Fig. 5-5

We shall also need the inverse circular function $\tan^{-1} x = u$. Since
$x = \tan u$, it follows from (59) that

$$e^{2iu} = \frac{i - x}{i + x} \qquad \text{and} \qquad u = \tan^{-1} x = \frac{1}{2i} \ln \frac{i - x}{i + x} \tag{5-61}$$
Suppose a point $P(x,y)$ is located as shown in Fig. 5 by means of two
vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ and two angles $\theta_1$ and $\theta_2$. For different positions of the
point in the XY-plane, the vectors are always drawn from the fixed points
A and B symmetrically located on the X-axis a distance $2a$ apart. If
$p = x + iy$; $p^* = x - iy$, then

$$x = (p^* + p)/2; \qquad y = i(p^* - p)/2 \tag{5-62}$$

The coordinates of the point are

$$p - a = r_1 e^{i\theta_1}; \qquad p + a = r_2 e^{i\theta_2} \tag{5-63}$$

and from the geometry of Fig. 5, it follows that

$$\begin{aligned}
r_1^2 &= (x - a)^2 + y^2; \qquad \theta_1 = \tan^{-1} \frac{y}{x - a} \\
r_2^2 &= (x + a)^2 + y^2; \qquad \theta_2 = \tan^{-1} \frac{y}{x + a}
\end{aligned} \tag{5-64}$$

Defining new quantities

$$\xi = \theta_1 - \theta_2; \qquad \eta = \ln \frac{r_2}{r_1} \tag{5-65}$$

and dividing the two equations of (63) by each other

$$\frac{p + a}{p - a} = e^{-i\chi}; \qquad \frac{p^* + a}{p^* - a} = e^{i\chi^*} \tag{5-66}$$

where

$$\chi = \xi + i\eta \tag{5-67}$$

In order to find $x$ and $y$ as functions of $\xi$ and $\eta$, substitute (66) and (67)
in (62). When use is made of (59) and (60) the results are

$$x = \frac{a \sinh\eta}{\cosh\eta - \cos\xi}; \qquad y = \frac{a \sin\xi}{\cosh\eta - \cos\xi} \tag{5-68}$$


To find the form of the coordinate surfaces, we start from the definition
of $\xi$ and use (61) to obtain

$$\xi = \frac{i}{2} \ln \frac{(ix - ia + y)(ix + ia - y)}{(ix - ia - y)(ix + ia + y)}$$

which may also be written as

$$x^2 + y^2 - a^2 + 2iay\,\frac{1 + e^{2i\xi}}{1 - e^{2i\xi}} = 0$$

We observe from (59) that the last term of this expression equals
$-2ay/\tan\xi = -2ay \cot\xi$. Hence

$$x^2 + y^2 - a^2 - 2ay \cot\xi = 0$$

or

$$x^2 + (y - a \cot\xi)^2 = a^2(1 + \cot^2\xi) = a^2 \csc^2\xi \tag{5-69}$$

In the same way we find

$$\eta = \frac{1}{2} \ln \frac{(x + a)^2 + y^2}{(x - a)^2 + y^2}$$

and

$$(x - a \coth\eta)^2 + y^2 = a^2 \operatorname{csch}^2\eta \tag{5-70}$$
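The relations (5-64), (5-65), (5-68), (5-69) and (5-70) can be tied together by a short numerical test. The sketch below is an illustration under stated assumptions: the values of $a$, $\xi$ and $\eta$ are arbitrary, the point is chosen above the X-axis so that both angles fall in the principal range of `math.atan2`, and A is taken at $(a, 0)$ and B at $(-a, 0)$ in accordance with (5-63).

```python
import math

def bipolar_to_cartesian(a, xi, eta):
    """x and y from eq. (5-68)."""
    denom = math.cosh(eta) - math.cos(xi)
    return a * math.sinh(eta) / denom, a * math.sin(xi) / denom

# Arbitrary sample values (assumptions for illustration only).
a, xi, eta = 2.0, 1.0, 0.6
x, y = bipolar_to_cartesian(a, xi, eta)

# Residuals of the circle equations (5-69) and (5-70).
circle_xi = x ** 2 + (y - a / math.tan(xi)) ** 2 - (a / math.sin(xi)) ** 2
circle_eta = (x - a / math.tanh(eta)) ** 2 + y ** 2 - (a / math.sinh(eta)) ** 2

# Recover xi and eta from the definitions (5-64) and (5-65).
r1 = math.hypot(x - a, y)           # distance from A(a, 0)
r2 = math.hypot(x + a, y)           # distance from B(-a, 0)
theta1 = math.atan2(y, x - a)
theta2 = math.atan2(y, x + a)
xi_recovered = theta1 - theta2
eta_recovered = math.log(r2 / r1)

print(circle_xi, circle_eta)
print(xi_recovered, eta_recovered)
```

For points below the X-axis the difference of the two `atan2` values is negative, so a branch convention for $\xi$ has to be added before the comparison is made.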
We thus see that for $\xi$ = const., $0 \le \xi < 2\pi$, we have a family of circles
with centers on the Y-axis at the point $x = 0$, $y = a \cot\xi$, the radii of the
circles being $a \csc\xi$. Each member of this family will pass through the
fixed points A and B as shown in Fig. 6 and will intersect the circles
$\eta$ = const. orthogonally. The members of the second family have radii
of length $a \operatorname{csch}\eta$ and are all situated on the X-axis at the points
$x = a \coth\eta$, $y = 0$. The point A is obtained when $\eta = +\infty$ and B when
$\eta = -\infty$. When $\eta = 0$, the circle degenerates into the Y-axis.
The position of a point in the XY-plane is thus fixed when we know in
which quadrant it lies and furthermore the constant values of $\eta$, $\xi$ of the
circles which pass through it. Since the fixed points A and B (that is, the
X-axis) divide each circle of the set $\xi$ = const. into two segments, we
arbitrarily take $\xi = \xi_0 < \pi$ for the arc above the X-axis and $\xi = \xi_0 + \pi$ for all
points below this axis.

Fig. 5-6

In order to use these circles as a coordinate system in space, imagine
them to be moved along the Z-axis. Then (69) and (70) represent two
families of right circular cylinders with axes parallel to the Z-axis. Suitable
coordinate surfaces are then: (1) cylinders with centers on the Y-axis
($\xi$ = const.); (2) cylinders with centers on the X-axis ($\eta$ = const.); (3)
planes perpendicular to the Z-axis ($z$ = const.). From (68) and (4),

$$Q_\xi^2 = Q_\eta^2 = \frac{a^2}{(\cosh\eta - \cos\xi)^2}; \qquad Q_z = 1 \tag{5-71}$$

Bipolar coordinates are useful¹¹ in problems of hydrodynamics and
electricity.

¹¹ See, for example, Milne-Thomson (loc. cit.); Maxwell (loc. cit.); Jeans, J. H.,
“The Mathematical Theory of Electricity and Magnetism,” Fifth Edition, Cambridge
Press, 1925.

5.15. Toroidal Coordinates.—If we rewrite (69) and (70) with $z^2$
substituted for $y^2$ and $r^2 = x^2 + y^2$ for $x^2$, the resulting equations

$$2az \cot\xi = r^2 + z^2 - a^2; \qquad 4a^2r^2 \coth^2\eta = (r^2 + z^2 + a^2)^2 \tag{5-72}$$

represent the families of spheres and tores (or anchor rings) obtained by
rotating the circles of the previous system about the Z-axis. If we take as
the third surface planes through the Z-axis, $\psi$ = const., then

$$y/x = \tan\psi \tag{5-73}$$

The orthogonal coordinate surfaces are thus: (1) spheres with centers on
the axis of revolution at distances $\pm a \cot\xi$ from the origin and radii $a \csc\xi$
($\xi$ = const.); (2) anchor rings or tores, whose axial circles have radii
$a \coth\eta$ and whose cross-sections are circles of radii $a \operatorname{csch}\eta$ ($\eta$ = const.);
(3) planes through the Z-axis ($\psi$ = const.). The spheres and anchor rings
have a common circle, $r = a$, $z = 0$. With methods similar to those used
in sec. 5.14,

$$x = r \cos\psi, \qquad y = r \sin\psi; \qquad r = \frac{a \sinh\eta}{\cosh\eta - \cos\xi}; \qquad z = \frac{a \sin\xi}{\cosh\eta - \cos\xi} \tag{5-74}$$

$$Q_\xi^2 = Q_\eta^2 = \frac{a^2}{(\cosh\eta - \cos\xi)^2}; \qquad Q_\psi^2 = \frac{a^2 \sinh^2\eta}{(\cosh\eta - \cos\xi)^2} \tag{5-75}$$

This system has found application¹² in certain problems of electricity and
of potential theory.


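¹² See Hobson, loc. cit., for references.

Problem a. Show that in spherical polar coordinates:

$$\nabla \cdot \mathbf{V} = \frac{1}{r^2 \sin\theta}\left\{\sin\theta\,\frac{\partial}{\partial r}(r^2 V_r) + r\,\frac{\partial}{\partial\theta}(\sin\theta\,V_\theta) + r\,\frac{\partial V_\varphi}{\partial\varphi}\right\}$$

$$\nabla^2 = \frac{1}{r^2 \sin\theta}\left\{\sin\theta\,\frac{\partial}{\partial r}\left(r^2 \frac{\partial}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin\theta}\,\frac{\partial^2}{\partial\varphi^2}\right\}$$

$$(\nabla \times \mathbf{V})_r = \frac{1}{r \sin\theta}\left\{\frac{\partial}{\partial\theta}(\sin\theta\,V_\varphi) - \frac{\partial V_\theta}{\partial\varphi}\right\}$$

$$(\nabla \times \mathbf{V})_\theta = \frac{1}{r \sin\theta}\left\{\frac{\partial V_r}{\partial\varphi} - \sin\theta\,\frac{\partial}{\partial r}(r V_\varphi)\right\}$$

$$(\nabla \times \mathbf{V})_\varphi = \frac{1}{r}\left\{\frac{\partial}{\partial r}(r V_\theta) - \frac{\partial V_r}{\partial\theta}\right\}$$

Problem b. Show that in cylindrical coordinates:

$$\nabla^2 = \frac{1}{\rho}\,\frac{\partial}{\partial\rho}\left(\rho\,\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\,\frac{\partial^2}{\partial\varphi^2} + \frac{\partial^2}{\partial z^2}$$

Problem c. If $V$ is the potential energy and $m$ is the mass of a particle show that
Newton’s laws of motion become:

(1) in spherical polar coordinates

$$m\{\ddot{r} - r\dot\theta^2 - r \sin^2\theta\,\dot\varphi^2\} = -\frac{\partial V}{\partial r}$$

$$m\left\{\frac{1}{r}\,\frac{d}{dt}(r^2\dot\theta) - r \sin\theta \cos\theta\,\dot\varphi^2\right\} = -\frac{1}{r}\,\frac{\partial V}{\partial\theta}$$

$$m\left\{\frac{1}{r \sin\theta}\,\frac{d}{dt}(r^2 \sin^2\theta\,\dot\varphi)\right\} = -\frac{1}{r \sin\theta}\,\frac{\partial V}{\partial\varphi}$$

(2) in cylindrical coordinates

$$m(\ddot\rho - \rho\dot\varphi^2) = -\frac{\partial V}{\partial\rho}$$

$$\frac{m}{\rho}\,\frac{d}{dt}(\rho^2\dot\varphi) = -\frac{1}{\rho}\,\frac{\partial V}{\partial\varphi}$$

$$m\ddot{z} = -\frac{\partial V}{\partial z}$$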

NON-ORTHOGONAL COORDINATE SYSTEMS 


5.16. Tensor Relations in Curvilinear Coordinates.—When the coordi- 
nate surfaces of a curvilinear system are not orthogonal, the methods of 
tensor analysis prove convenient (see sec. 4.20 ff.). The relations which 
we are about to derive are more general than those obtained in the first 
part of this chapter; in fact, we will show that the two formulations of 
the problem become equivalent for orthogonal coordinates. 

Let $(x^1,x^2,x^3)$ be the usual Cartesian coordinates of a point and $(q^1,q^2,q^3)$
be its curvilinear coordinates, as discussed in sec. 5.1. Then in tensor
notation,¹³ eq. (3) becomes

$$ds^2 = g_{ij}\,dq^i dq^j \tag{5-3a}$$

where

$$g_{ij} = \frac{\partial x^k}{\partial q^i}\,\frac{\partial x^k}{\partial q^j} = g_{ji} \tag{5-4a}$$

is identical with $Q_{ij}$ of eq. (4). The line element is clearly

$$ds_i = \sqrt{g_{ii}}\,dq^i; \qquad (i \text{ not summed}) \tag{5-5a}$$

¹³ We generally use the summation convention throughout the rest of this chapter.
In certain cases, repeated indices are not to be summed; such exceptions should be
obvious to the reader.

In order to find the surface element, we recall from sec. 4.5 that a surface
may be represented as the vector product of two other vectors. Thus
let $d\mathbf{s}_2$ be an infinitesimal displacement at the point $(q^1,q^2,q^3)$ along the
coordinate line $q^2$ and $d\mathbf{s}_3$ be a similar displacement along the line $q^3$.
Then the vector $d\mathbf{S}_1 = d\mathbf{s}_2 \times d\mathbf{s}_3$ is perpendicular to the plane $q^1$ = const.
and its magnitude $dS_1$ is the desired surface element in that plane. Before
we can obtain the appropriate expression for it in terms of the tensor $g_{ij}$
we must digress in order to consider two important systems of vectors in
curvilinear coordinates. Suppose $\mathbf{r} = \mathbf{r}(q^1,q^2,q^3)$ is a vector and

$$d\mathbf{r} = \frac{\partial\mathbf{r}}{\partial q^1}\,dq^1 + \frac{\partial\mathbf{r}}{\partial q^2}\,dq^2 + \frac{\partial\mathbf{r}}{\partial q^3}\,dq^3$$

is a small displacement. If we define three vectors

$$\mathbf{e}_i = \frac{\partial\mathbf{r}}{\partial q^i}$$

then it is clear that we may write

$$d\mathbf{r} = \mathbf{e}_i\,dq^i \tag{5-76}$$


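These vectors, $\mathbf{e}_i$, which we call base vectors,¹⁴ are directed tangentially
along the coordinate curves but they are not necessarily of unit length.
While it is usually more convenient to resolve an arbitrary vector $\mathbf{A}$ into
components which are multiples of a unit vector we may also write

$$\mathbf{A} = a^i\mathbf{e}_i \tag{5-77}$$

and the three scalars $a^i$ are the contravariant components of $\mathbf{A}$. Let us
define another set of base vectors

$$\mathbf{e}^1 = \frac{\mathbf{e}_2 \times \mathbf{e}_3}{v}; \qquad \mathbf{e}^2 = \frac{\mathbf{e}_3 \times \mathbf{e}_1}{v}; \qquad \mathbf{e}^3 = \frac{\mathbf{e}_1 \times \mathbf{e}_2}{v} \tag{5-78}$$

where $v$ is the scalar triple product $[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]$ of sec. 4.6b. These vectors are
perpendicular to the planes of $\mathbf{e}_2$, $\mathbf{e}_3$; $\mathbf{e}_3$, $\mathbf{e}_1$ and $\mathbf{e}_1$, $\mathbf{e}_2$, respectively, and it
is easily seen that

$$\mathbf{e}^m \cdot \mathbf{e}_n = \delta^m_n \tag{5-79}$$

Furthermore, it is true that

$$\mathbf{e}_1 = \frac{\mathbf{e}^2 \times \mathbf{e}^3}{v'}; \qquad \mathbf{e}_2 = \frac{\mathbf{e}^3 \times \mathbf{e}^1}{v'}; \qquad \mathbf{e}_3 = \frac{\mathbf{e}^1 \times \mathbf{e}^2}{v'} \tag{5-80}$$

where $v' = [\mathbf{e}^1\mathbf{e}^2\mathbf{e}^3]$ and $vv' = 1$;
hence the two sets of vectors $\mathbf{e}^m$ and $\mathbf{e}_n$ are said to be reciprocal to each
other. In terms of the reciprocal set¹⁵ (76) becomes

$$d\mathbf{r} = \mathbf{e}^i\,dq_i \tag{5-81}$$

and (77) becomes

$$\mathbf{A} = a_i\mathbf{e}^i \tag{5-82}$$

where the $a_i$ are the covariant components of $\mathbf{A}$.

¹⁴ The systems of base vectors introduced here are treated by matrix methods in
sec. 10.10.

¹⁵ Many interesting properties of reciprocal systems are presented by Gibbs-Wilson,
loc. cit., pp. 81-92.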
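The construction (5-78) and the properties (5-79) and $vv' = 1$ of (5-80) can be tried out on a concrete oblique set of base vectors. The sketch below is an illustration only: the three vectors are an arbitrary non-orthogonal choice, and the helper names `dot`, `cross` and `reciprocal_basis` are hypothetical.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def reciprocal_basis(e1, e2, e3):
    """Reciprocal base vectors e^1, e^2, e^3 according to eq. (5-78)."""
    v = dot(e1, cross(e2, e3))  # scalar triple product [e1 e2 e3]
    pairs = ((e2, e3), (e3, e1), (e1, e2))
    return tuple(tuple(c / v for c in cross(p, q)) for p, q in pairs)

# An arbitrary oblique (non-orthogonal) set, assumed only for illustration.
e = ((1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.5, 1.0, 3.0))
E = reciprocal_basis(*e)

# Matrix of products e^m . e_n, expected to be the unit matrix by (5-79).
kronecker = [[dot(E[m], e[n]) for n in range(3)] for m in range(3)]

v = dot(e[0], cross(e[1], e[2]))
v_prime = dot(E[0], cross(E[1], E[2]))

print(kronecker)
print(v * v_prime)
```

Applying `reciprocal_basis` to the reciprocal set itself should return the original vectors, which is the content of (5-80).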
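If we equate (76) and (81) we obtain

$$d\mathbf{r} = \mathbf{e}_i\,dq^i = \mathbf{e}^i\,dq_i \tag{5-83}$$

and if we multiply by $\mathbf{e}^j$ or $\mathbf{e}_j$ we find, because of (79), that

$$dq^j = \mathbf{e}^j \cdot \mathbf{e}^i\,dq_i; \qquad dq_j = \mathbf{e}_j \cdot \mathbf{e}_i\,dq^i \tag{5-84}$$

Since the square of the distance between two points is given by $ds^2 = d\mathbf{r} \cdot d\mathbf{r}$
we see from (83) that

$$ds^2 = \mathbf{e}_i \cdot \mathbf{e}_j\,dq^i dq^j = \mathbf{e}^i \cdot \mathbf{e}^j\,dq_i dq_j \tag{5-85}$$

We may therefore identify the scalar products of the base vectors with the
tensors $g^{ij}$ and $g_{ij}$

$$g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j; \qquad g^{ij} = \mathbf{e}^i \cdot \mathbf{e}^j \tag{5-86}$$

For later use, we also note that we may equate (77) to (82)

$$\mathbf{A} = a_i\mathbf{e}^i = a^i\mathbf{e}_i \tag{5-87}$$

and use (79) and (86) to write

$$a_i = g_{ij}a^j; \qquad a^i = g^{ij}a_j \tag{5-88}$$

We also have from (87) the equivalent expressions

$$a_i = \mathbf{A} \cdot \mathbf{e}_i; \qquad a^i = \mathbf{A} \cdot \mathbf{e}^i$$

hence (87) may be stated in the alternative form

$$\mathbf{A} = (\mathbf{A} \cdot \mathbf{e}_i)\mathbf{e}^i = (\mathbf{A} \cdot \mathbf{e}^i)\mathbf{e}_i \tag{5-89}$$

We now have several relations by means of which we may find either the
contravariant or covariant components of an arbitrary vector $\mathbf{A}$. If we
wish to know the components in terms of unit vectors, tangent to the $q^i$
coordinate lines, we recall the equation defining the length of a vector
(sec. 4.4) and see that the appropriate unit vectors are

$$\mathbf{u}_i = \frac{\mathbf{e}_i}{\sqrt{\mathbf{e}_i \cdot \mathbf{e}_i}} = \frac{\mathbf{e}_i}{\sqrt{g_{ii}}}$$

Therefore, any vector $\mathbf{A}$ may also be written as

$$\mathbf{A} = A_i\mathbf{u}_i$$

where

$$A_i = \sqrt{g_{ii}}\,a^i$$

If needed, similar equations could be given in the reciprocal system.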
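Let us now return to the problem of the surface element in curvilinear
coordinates. Since $d\mathbf{s}_i = \mathbf{e}_i\,dq^i$, we have

$$d\mathbf{S}_1 = d\mathbf{s}_2 \times d\mathbf{s}_3 = (\mathbf{e}_2 \times \mathbf{e}_3)\,dq^2 dq^3$$

and

$$dS_1 = [(\mathbf{e}_2 \times \mathbf{e}_3) \cdot (\mathbf{e}_2 \times \mathbf{e}_3)]^{1/2}\,dq^2 dq^3$$

It is easy to show (see Problem a, sec. 4.6) that the scalar product inside
the brackets becomes

$$(\mathbf{e}_2 \cdot \mathbf{e}_2)(\mathbf{e}_3 \cdot \mathbf{e}_3) - (\mathbf{e}_2 \cdot \mathbf{e}_3)(\mathbf{e}_3 \cdot \mathbf{e}_2)$$

Thus when we use (86) we obtain

$$dS_1 = \sqrt{g_{22}g_{33} - g_{23}^2}\;dq^2 dq^3 \tag{5-6a}$$

Similarly, surface elements on the planes $q^2$ = const. and $q^3$ = const. are

$$dS_2 = \sqrt{g_{11}g_{33} - g_{13}^2}\;dq^1 dq^3$$

$$dS_3 = \sqrt{g_{11}g_{22} - g_{12}^2}\;dq^1 dq^2$$

The volume element is

$$d\tau = d\mathbf{s}_1 \cdot d\mathbf{s}_2 \times d\mathbf{s}_3 = [\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]\,dq^1 dq^2 dq^3$$

If we place $\mathbf{A} = \mathbf{e}_2 \times \mathbf{e}_3$ and use (89) we get

$$\mathbf{A} = \mathbf{e}_2 \times \mathbf{e}_3 = [\mathbf{e}^1\mathbf{e}_2\mathbf{e}_3]\mathbf{e}_1 + [\mathbf{e}^2\mathbf{e}_2\mathbf{e}_3]\mathbf{e}_2 + [\mathbf{e}^3\mathbf{e}_2\mathbf{e}_3]\mathbf{e}_3$$

Now by means of (4-18) and (78) we eliminate $\mathbf{e}^i$ to obtain

$$[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] = \mathbf{e}_1 \cdot \mathbf{A} = \frac{\mathbf{e}_1}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]} \cdot \{(\mathbf{e}_2 \times \mathbf{e}_3 \cdot \mathbf{e}_2 \times \mathbf{e}_3)\mathbf{e}_1 + (\mathbf{e}_3 \times \mathbf{e}_1 \cdot \mathbf{e}_2 \times \mathbf{e}_3)\mathbf{e}_2 + (\mathbf{e}_1 \times \mathbf{e}_2 \cdot \mathbf{e}_2 \times \mathbf{e}_3)\mathbf{e}_3\}$$

Finally we expand the scalar products within the brackets using again the
result of Problem a, sec. 4.6 and getting

$$\begin{aligned}
[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]^2 ={}& \mathbf{e}_1 \cdot \mathbf{e}_1[(\mathbf{e}_2 \cdot \mathbf{e}_2)(\mathbf{e}_3 \cdot \mathbf{e}_3) - (\mathbf{e}_2 \cdot \mathbf{e}_3)(\mathbf{e}_3 \cdot \mathbf{e}_2)] \\
&+ \mathbf{e}_1 \cdot \mathbf{e}_2[(\mathbf{e}_2 \cdot \mathbf{e}_3)(\mathbf{e}_3 \cdot \mathbf{e}_1) - (\mathbf{e}_2 \cdot \mathbf{e}_1)(\mathbf{e}_3 \cdot \mathbf{e}_3)] \\
&+ \mathbf{e}_1 \cdot \mathbf{e}_3[(\mathbf{e}_2 \cdot \mathbf{e}_1)(\mathbf{e}_3 \cdot \mathbf{e}_2) - (\mathbf{e}_2 \cdot \mathbf{e}_2)(\mathbf{e}_3 \cdot \mathbf{e}_1)]
\end{aligned}$$

By means of (86) we may replace the scalar products in this equation by
the $g_{ij}$, finding that $[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] = \sqrt{g}$ where $g$ is the determinant of the
components of $g_{ij}$ and the volume element becomes

$$d\tau = \sqrt{g}\;dq^1 dq^2 dq^3 \tag{5-7a}$$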
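The two results of this section, (5-6a) and $[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] = \sqrt{g}$, can be confirmed for a concrete oblique set of base vectors. The sketch below is an illustration under stated assumptions: the base vectors are an arbitrary constant non-orthogonal choice, and the helpers `dot`, `cross` and `det3` are hypothetical names.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# An arbitrary oblique set of base vectors, assumed only for illustration.
e = ((1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.5, 1.0, 3.0))

# Metric tensor g_ij = e_i . e_j, eq. (5-86).
g = [[dot(e[i], e[j]) for j in range(3)] for i in range(3)]

det_g = det3(g)
triple_product = dot(e[0], cross(e[1], e[2]))

# Coefficient of dq^2 dq^3 in (5-6a), compared with |e_2 x e_3|.
area_from_metric = math.sqrt(g[1][1] * g[2][2] - g[1][2] ** 2)
n = cross(e[1], e[2])
area_from_cross = math.sqrt(dot(n, n))

print(det_g, triple_product ** 2)
print(area_from_metric, area_from_cross)
```

The sign of the triple product depends on the handedness of the base vectors; only its square enters the comparison with $g$.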


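5.17. The Differential Operators in Tensor Notation.—We have seen
in sec. 4.20 that the components of the gradient of a scalar point function $\varphi$
are $\partial\varphi/\partial q^i$. The direction of $dq^i$ is determined by the vector $d\mathbf{s}_i = \mathbf{e}_i\,dq^i$;
hence, since $\mathbf{e}^i = g^{ij}\mathbf{e}_j$, we have in curvilinear coordinates

$$\nabla\varphi = \mathbf{e}^i\,\frac{\partial\varphi}{\partial q^i} = g^{ij}\mathbf{e}_j\,\frac{\partial\varphi}{\partial q^i} \tag{5-9a}$$

The divergence of a vector $\mathbf{V}$ in terms of its contravariant components is
the covariant derivative. Thus, from (4-71)

$$\nabla \cdot \mathbf{V} = V^i_{,i} = \frac{\partial V^i}{\partial q^i} + V^j\{ij, i\} \tag{5-90}$$

Now according to (4-63)

$$\{ij, i\} = \tfrac{1}{2}\,g^{ik}\left(\frac{\partial g_{ik}}{\partial q^j} + \frac{\partial g_{jk}}{\partial q^i} - \frac{\partial g_{ij}}{\partial q^k}\right)$$

but

$$g^{ik}\,\frac{\partial g_{jk}}{\partial q^i} = g^{ki}\,\frac{\partial g_{ji}}{\partial q^k}$$

since we may exchange the dummy indices $i$ and $k$. Moreover, $g^{ik} = g^{ki}$
and $g_{ij} = g_{ji}$ so we may cancel the second and third terms in
$\{ij, i\}$. Finally, we refer to (4-59) and the rule for differentiating determinants
(see sec. 10.4) to prove that

$$\frac{\partial g}{\partial g_{ij}} = G^{ij} = g\,g^{ij}$$

The Christoffel symbol therefore takes the form

$$\{ij, i\} = \tfrac{1}{2}\,g^{ik}\,\frac{\partial g_{ik}}{\partial q^j} = \frac{1}{2g}\,\frac{\partial g}{\partial q^j} = \frac{1}{\sqrt{g}}\,\frac{\partial\sqrt{g}}{\partial q^j}$$

and

$$\nabla \cdot \mathbf{V} = \frac{\partial V^i}{\partial q^i} + \frac{V^i}{\sqrt{g}}\,\frac{\partial\sqrt{g}}{\partial q^i} = \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial q^i}\left(\sqrt{g}\,V^i\right) \tag{5-20a}$$

A similar expression may be obtained in terms of covariant components
of $\mathbf{V}$.

If $\mathbf{V} = \nabla\varphi$, the contravariant components of the gradient are
$V^i = \mathbf{V} \cdot \mathbf{e}^i$ by (88) and by (9a)

$$V^i = \mathbf{e}^i \cdot \mathbf{e}^j\,\frac{\partial\varphi}{\partial q^j} = g^{ij}\,\frac{\partial\varphi}{\partial q^j}$$

Substituting this result in (20a) we find for the Laplacian

$$\nabla^2\varphi = \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial q^i}\left[\sqrt{g}\,g^{ij}\,\frac{\partial\varphi}{\partial q^j}\right] \tag{5-21a}$$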
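A numerical sketch may make (5-21a) more concrete. It specializes the formula to spherical polar coordinates, for which the metric is diagonal with $\sqrt{g} = r^2 \sin\theta$ and $g^{11} = 1$, $g^{22} = 1/r^2$, $g^{33} = 1/(r^2\sin^2\theta)$, so that only the terms with $i = j$ survive. The derivatives are replaced by nested central differences. The function names, the step `h`, the sample point and the test function $\varphi = x^2$ (whose Cartesian Laplacian is 2) are all assumptions made for illustration.

```python
import math

def sqrt_g(q):
    """sqrt(g) = r^2 sin(theta) for spherical polar coordinates."""
    r, theta, _ = q
    return r * r * math.sin(theta)

def g_inverse_diagonal(q):
    """Diagonal contravariant components g^{11}, g^{22}, g^{33}."""
    r, theta, _ = q
    return (1.0, 1.0 / r ** 2, 1.0 / (r * math.sin(theta)) ** 2)

def laplacian(f, q, h=1e-3):
    """Eq. (5-21a) for a diagonal metric, evaluated by nested central differences."""
    total = 0.0
    for i in range(3):
        def flux(shift):
            # sqrt(g) * g^{ii} * df/dq^i at the point q displaced by `shift` along q^i.
            p = list(q)
            p[i] += shift
            lower, upper = list(p), list(p)
            lower[i] -= h / 2.0
            upper[i] += h / 2.0
            dfdq = (f(upper) - f(lower)) / h
            return sqrt_g(p) * g_inverse_diagonal(p)[i] * dfdq
        total += (flux(h / 2.0) - flux(-h / 2.0)) / h
    return total / sqrt_g(q)

def phi_test(q):
    """Test function x^2 written in spherical coordinates; its Laplacian is 2."""
    r, theta, azimuth = q
    x = r * math.sin(theta) * math.cos(azimuth)
    return x * x

point = (1.3, 0.9, 0.5)   # arbitrary sample point away from the polar axis
value = laplacian(phi_test, point)
print(value)
```

For a non-orthogonal system the off-diagonal $g^{ij}$ would have to be included in `flux`; the diagonal shortcut used here is valid only because the spherical coordinate surfaces are mutually orthogonal.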


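The final expression we wish to derive here is the curl. Define the
covariant tensor of rank two

$$V_{ij} = \frac{\partial V_i}{\partial q^j} - \frac{\partial V_j}{\partial q^i}$$

If we transform to a new coordinate system we see from eq. (4-53) that

$$\bar{V}_{mn} = \frac{\partial q^i}{\partial\bar{q}^m}\,\frac{\partial q^j}{\partial\bar{q}^n}\,V_{ij}$$

This tensor which is invariant to such a transformation is the curl in
curvilinear coordinates. According to its definition, it is skew symmetric,
hence the only non-vanishing components are $V_{12}$, $V_{23}$ and $V_{31}$. In terms
of the base vectors we write

$$\nabla \times \mathbf{V} = V_{12}(\mathbf{e}^1 \times \mathbf{e}^2) + V_{23}(\mathbf{e}^2 \times \mathbf{e}^3) + V_{31}(\mathbf{e}^3 \times \mathbf{e}^1)$$

We have shown in (78) how to convert the $\mathbf{e}^i$ into the reciprocal base
vectors and we have also proved that $v = [\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] = \sqrt{g}$. With these
changes, the curl of a vector $\mathbf{V}$ is*

$$\nabla \times \mathbf{V} = \frac{1}{\sqrt{g}}\left[\left(\frac{\partial V_1}{\partial q^2} - \frac{\partial V_2}{\partial q^1}\right)\mathbf{e}_3 + \left(\frac{\partial V_2}{\partial q^3} - \frac{\partial V_3}{\partial q^2}\right)\mathbf{e}_1 + \left(\frac{\partial V_3}{\partial q^1} - \frac{\partial V_1}{\partial q^3}\right)\mathbf{e}_2\right] \tag{5-22a}$$

It is a simple matter to see what happens when the coordinate surfaces
are mutually orthogonal. In that case, the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ are also
perpendicular to each other and $\mathbf{e}^i$ is parallel to $\mathbf{e}_i$. Moreover,

$$\mathbf{e}^i = \frac{\mathbf{e}_i}{\mathbf{e}_i \cdot \mathbf{e}_i} = \frac{\mathbf{e}_i}{g_{ii}}$$

and $\mathbf{e}_1 \cdot \mathbf{e}_2 = \mathbf{e}_2 \cdot \mathbf{e}_3 = \mathbf{e}_3 \cdot \mathbf{e}_1 = 0$. It thus follows that $g_{ij} = 0$ unless
$i = j$; in the latter case,

$$g_{ii} = 1/g^{ii}$$

Remembering that $g_{ii}$ is then identical with $Q_i^2$ as used in the first parts
of this chapter, equations such as (3a), (4a), etc., in secs. 5.16 and
5.17 will reduce to the corresponding equations which appeared earlier
in this chapter without the letter a.

Problem. Derive by the tensor method the results of Problems a, b, c, of sec. 5.15.

* Note that this tensor differs in sign from the conventional curl of vector analysis.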


CHAPTER 6 
CALCULUS OF VARIATIONS 


One of the elementary problems of the differential calculus is to find
the maxima and minima, that is, the stationary values, of a function $y(x)$.
The necessary condition for the occurrence of a stationary value at $x = a$
is that $y'(a) = 0$. Sufficient conditions that it shall be a minimum or a
maximum are, respectively, $y''(a) > 0$ and $y''(a) < 0$. The calculus of
variations deals with a similar, but a more complicated problem, that of
finding a function $y(x)$ such that a definite integral, taken over a function
of this function, shall be a maximum or a minimum. The simpler parts of
this calculus, to which this chapter will be primarily devoted, deal with
the necessary conditions that the integral shall be either a maximum or a
minimum; in other words, that it shall have a stationary value; sufficiency
considerations as well as criteria for establishing the maximum or minimum
character of the solutions are not important in many physical applications.
For these, the reader should consult the more comprehensive treatises on
the subject listed at the end of this chapter.

6.1. Single Independent and Single Dependent Variable.—Let it be
desired, then, to find that function $y(x)$ which will cause the integral

$$\int_{x_1}^{x_2} I(x, y, y_x)\,dx$$

to have a stationary value. The integrand $I$ is taken to be a function of
the dependent variable $y$ as well as the independent variable $x$ and
$y_x = dy/dx$. The limits $x_1$ and $x_2$ are fixed and at each of them, $y$ has a
fixed value. The integral over $I$ takes on different values along different
paths connecting the points $(x_1,y_1)$ and $(x_2,y_2)$; one of these paths is
labeled $Y(x)$ in Fig. 1. We assume that it is either largest or smallest
along $y(x)$, for example. The paths $Y(x)$ which are admitted for comparison
shall be “adjacent” paths covering a small neighborhood of the
stationary path $y(x)$, that is, $Y(x) - y(x)$ shall be infinitesimal for all values
of $x$ between $x_1$ and $x_2$.
We define:

$$\delta y(x) = Y(x) - y(x) \tag{6-1}$$

$$\delta I = I(x, Y, Y_x) - I(x, y, y_x) \tag{6-2}$$
The symbol $\delta$ is called variation; it represents the increase in the
quantity to which it is applied as we pass from the stationary path to the
comparison path at the same value of $x$. Thus, clearly $\delta x = 0$. Furthermore,

$$\delta y_x = Y_x - y_x = \frac{d}{dx}(Y - y) = \frac{d}{dx}\,\delta y \tag{6-3}$$

This shows that the symbols $\delta$ and $d/dx$ “commute.” Since $Y$ and $y$ are
adjacent, it follows from (2) that

$$\delta I = I(x, y + \delta y, y_x + \delta y_x) - I(x, y, y_x) = \frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial y_x}\,\delta y_x \tag{6-4}$$

In words, the formal rules for computing variations are the same as those
for computing differentials.
In terms of this notation, the condition that $\int I\,dx$ be stationary is
easily written down. It is simply that the integral along $y$ shall yield the
same value as that along $y + \delta y$,

$$\int_{x_1}^{x_2} \delta I\,dx = 0 \tag{6-5}$$

This is of course the analogue of the condition in the ordinary calculus
that $y(x)$ be stationary, i.e., $dy = 0$. With the use of (3) and (4), eq. (5)
becomes

$$\int_{x_1}^{x_2} \left(\frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial y_x}\,\frac{d}{dx}\,\delta y\right) dx = 0$$

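The second term of the integrand yields after partial integration

$$-\int_{x_1}^{x_2} \left(\frac{d}{dx}\,\frac{\partial I}{\partial y_x}\right)\delta y\,dx + \left[\frac{\partial I}{\partial y_x}\,\delta y\right]_{x_1}^{x_2}$$

But the integrated part vanishes at both limits because $\delta y_1 = \delta y_2 = 0$.
Hence the stationarity condition becomes

$$\int_{x_1}^{x_2} \left(\frac{\partial I}{\partial y} - \frac{d}{dx}\,\frac{\partial I}{\partial y_x}\right)\delta y\,dx = 0 \tag{6-6}$$

While the vanishing of an integral does not in general imply that the
integrand is zero, we may nevertheless conclude here that

$$\frac{\partial I}{\partial y} - \frac{d}{dx}\,\frac{\partial I}{\partial y_x} = 0 \tag{6-7}$$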

This is because the parenthesis in (6) is multiplied by an arbitrary though 
infinitesimally small function of z, namely dy. For if the left-hand side of 
(7) were not zero for every z, it would have to be positive m some range and 
negative in another range in order to satisfy (6) with a positive dy. We 
may then choose dy to be positive where the left side of (7) is positive and 
negative elsewhere, an arrangement which would violate (6). Hence 
eq. (7) follows and is the condition we are sceking. A function y which 
satisfies that differential equation is called an extremal. Among these 
extremals the minimizing or maximizing curve y will be found, provided 
it exists. 

Eq. (7) was first derived by Euler; it is called the Euler equation
associated with the variation problem. It may be written in a different
form:

$$\frac{\partial I}{\partial x} - \frac{d}{dx}\left(I - y_x\frac{\partial I}{\partial y_x}\right) = 0 \tag{6-7a}$$

which is useful when $I$ does not depend explicitly on $x$, for then (7a) shows
that

$$I - y_x\frac{\partial I}{\partial y_x} = \text{const.}$$

represents an extremal. The identity of (7) and (7a) is at once established
by noting:

$$\frac{dI}{dx} = \frac{\partial I}{\partial x} + y_x\frac{\partial I}{\partial y} + y_{xx}\frac{\partial I}{\partial y_x}$$

Examples. 
a. Geodesics. It is usually taken for granted that a straight line is the
shortest distance between two points in a plane. The calculus of variations
provides a formal proof of this assertion. The element of distance in
Cartesian coordinates is given by $ds^2 = dx^2 + dy^2$. Hence

$$s = \int_{x_1}^{x_2} ds = \int_{x_1}^{x_2} (1 + y_x^2)^{1/2}\,dx$$

If this is to be a minimum, Euler’s equation (7), with $I = (1 + y_x^2)^{1/2}$,
must be satisfied. Hence

$$\frac{d}{dx}\left[\frac{y_x}{(1 + y_x^2)^{1/2}}\right] = 0$$

or

$$\frac{y_x}{\sqrt{1 + y_x^2}} = \text{const.}$$


which means dy/dx = const. 

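The minimizing curve is the straight line passing through the points
$y_1$ and $y_2$. Had we chosen polar coordinates, the problem would have
been to find $r$ as a function of $\varphi$ such that

$$s = \int_{\varphi_1}^{\varphi_2} (dr^2 + r^2\,d\varphi^2)^{1/2} = \int_{\varphi_1}^{\varphi_2} (r^2 + r_\varphi^2)^{1/2}\,d\varphi$$

is stationary. The Euler equation then reads

$$\frac{r}{(r^2 + r_\varphi^2)^{1/2}} - \frac{d}{d\varphi}\left[\frac{r_\varphi}{(r^2 + r_\varphi^2)^{1/2}}\right] = 0$$

This reduces to

$$\frac{r r_{\varphi\varphi} - 2r_\varphi^2 - r^2}{(r^2 + r_\varphi^2)^{3/2}} = 0$$

The expression on the left is simply the curvature of the curve in polar
coordinates; hence the result is the same as before.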
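
The plane result can be checked numerically. The following is a minimal sketch, not part of the original derivation: it compares the length of the straight line with the lengths of neighboring paths having the same end points. The end points, the sine-shaped variation, and the function names are illustrative assumptions.

```python
import math

def path_length(y, x1, x2, n=2000):
    """Approximate the arc length of y(x) on [x1, x2] by a polygon with n segments."""
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        xa = x1 + i * h
        xb = xa + h
        total += math.hypot(h, y(xb) - y(xa))
    return total

# Assumed end points (x1, y1) and (x2, y2); their distance is 5.
x1, y1, x2, y2 = 0.0, 0.0, 3.0, 4.0
slope = (y2 - y1) / (x2 - x1)

def straight(x):
    # The extremal found above: dy/dx = const.
    return y1 + slope * (x - x1)

def perturbed(eps):
    # The added term vanishes at x1 and x2, so the end points stay fixed.
    return lambda x: straight(x) + eps * math.sin(math.pi * (x - x1) / (x2 - x1))

s0 = path_length(straight, x1, x2)
for eps in (-0.2, -0.05, 0.05, 0.2):
    # Excess length of the comparison path over the straight line.
    print(eps, path_length(perturbed(eps), x1, x2) - s0)
```

Every printed excess should be positive and should shrink roughly as the square of `eps`, as expected near a stationary value.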
The element of distance on the surface of a sphere of radius $a$ is given by

$$ds = a(d\theta^2 + \sin^2\theta\,d\varphi^2)^{1/2}$$

If we wish to find $\varphi$ as a function of $\theta$ such that $s$ is stationary, we must solve
(7) with $I = (1 + \sin^2\theta\,\varphi_\theta^2)^{1/2}$:

$$\frac{d}{d\theta}\left[\frac{\sin^2\theta\,\varphi_\theta}{(1 + \sin^2\theta\,\varphi_\theta^2)^{1/2}}\right] = 0$$

When the bracket is put equal to a constant, $c$, we get

$$\varphi_\theta = \frac{c\,\mathrm{cosec}^2\,\theta}{(1 - c^2 - c^2\cot^2\theta)^{1/2}}$$

and on integrating

$$\varphi = \alpha - \sin^{-1}(k\cot\theta)$$

$\alpha$ and $k$ being new constants. To interpret this result we write it in
Cartesian coordinates, using $z = a\cos\theta$. We have $ak\cot\theta = a\sin(\alpha - \varphi)$,
or, on multiplying by $\sin\theta$,

$$kz = x\sin\alpha - y\cos\alpha$$

This represents a plane passing through the origin and hence cutting the
surface of the sphere in a great circle. The shortest (and also the longest)
distance between two points on the surface of the sphere is the arc of the
great circle connecting them!


b. The Brachistochrone. 

A problem which held the fascination of mathematicians for several
decades of the 17th and 18th centuries is that of finding the path on which 
an object, in the absence of friction, will slide from one given point to 
another in the shortest (brachistos) time (chronos). John Bernoulli 
proposed the problem in 1696; both he and his brother James, and also 
Newton and Leibnitz, found the correct solution. The path, which 
happens to be a cycloid, is known as the brachistochrone. 

Let the particle start from rest at the origin; the terminal point of the
motion is $(x_2, y_2)$. In working this problem it is convenient to extend the
Y-axis to the right and to measure $x$ downward. Then from the principle
of conservation of energy,

$$\tfrac{1}{2}mv^2 = mgx$$

where $v$ is the velocity of the particle at any point of its path, $m$ its mass and
$g$ the acceleration of gravity. Hence, since

$$v = \frac{ds}{dt} = \frac{\sqrt{dx^2 + dy^2}}{dt}$$

$$dt = \frac{(1 + y_x^2)^{1/2}}{(2gx)^{1/2}}\,dx$$

The integral to be minimized is therefore

$$\sqrt{2g}\,t = \int_0^{x_2}\left(\frac{1 + y_x^2}{x}\right)^{1/2} dx$$

Euler’s equation reads

$$\frac{d}{dx}\left\{\frac{y_x}{[x(1 + y_x^2)]^{1/2}}\right\} = 0$$

Hence

$$\frac{y_x^2}{x(1 + y_x^2)} = c$$

If we introduce the constant $2a = 1/c$, integration leads to

$$y = a\cos^{-1}\left(1 - \frac{x}{a}\right) - (2ax - x^2)^{1/2} + c' \tag{6-8}$$

But the new constant of integration, $c'$, must be zero in order to make $y$
vanish at $x = 0$. Eq. (8) represents the equation of an inverted cycloid
with its base along Y and its cusp at the origin. (Cf. Fig. 2.) The constant
$a$ must be so adjusted that the cycloid passes through the point
$(x_2, y_2)$. The path will also be a cycloid if we allow the particle to fall with
a finite initial velocity, as the reader may verify.
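
As a numerical illustration, the time of descent along the cycloid may be compared with that along a straight chute joining the same two points. The sketch below is an addition under stated assumptions: the values of `g` and `a`, the choice of end point, and the parametric form $x = a(1 - \cos\theta)$, $y = a(\theta - \sin\theta)$ of the cycloid are not taken from the text, although the latter is checked against eq. (6-8) with $c' = 0$.

```python
import math

g = 9.81   # acceleration of gravity (assumed value)
a = 1.0    # the constant a of eq. (6-8) (assumed value)

def cycloid_point(theta):
    # x is measured downward and y to the right, as in the text.
    return a * (1.0 - math.cos(theta)), a * (theta - math.sin(theta))

def y_from_eq_6_8(x):
    # Eq. (6-8) with the constant of integration c' set to zero.
    return a * math.acos(1.0 - x / a) - math.sqrt(2.0 * a * x - x * x)

def cycloid_time(theta_end, n=20000):
    """Midpoint-rule value of the integral of ds / v along the cycloid."""
    h = theta_end / n
    t = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        x, _ = cycloid_point(th)
        ds_dtheta = a * math.sqrt(2.0 * (1.0 - math.cos(th)))
        t += ds_dtheta / math.sqrt(2.0 * g * x) * h   # v = (2 g x)^(1/2)
    return t

def straight_line_time(x2, y2):
    # Uniform acceleration g * x2 / L along a straight chute of length L.
    L = math.hypot(x2, y2)
    return L * math.sqrt(2.0 / (g * x2))

theta_end = math.pi                      # assumed end of the arc
x2, y2 = cycloid_point(theta_end)
t_cycloid = cycloid_time(theta_end)
t_line = straight_line_time(x2, y2)
print(t_cycloid, t_line)
```

For this end point the cycloid time should come out shorter than the straight-line time, in keeping with the result of the text.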

[Fig. 6-2: axes X and Y, with the terminal point $(x_2, y_2)$ marked]


c. Minimum Surface of Revolution. 


The soap film problem discussed in sec. 2.2i may also be solved by the
method outlined above. Whatever the function $y$, the surface generated
by revolving $y$ about the X-axis has an area

$$2\pi\int_{x_1}^{x_2} y\,ds = 2\pi\int_{x_1}^{x_2} y(1 + y_x^2)^{1/2}\,dx$$

If this is to be a minimum, eq. (7a) requires that

$$y(1 + y_x^2)^{1/2} - y\,y_x^2(1 + y_x^2)^{-1/2} = \frac{y}{(1 + y_x^2)^{1/2}} = a$$

Hence

$$\frac{dy}{dx} = \left(\frac{y^2}{a^2} - 1\right)^{1/2}, \qquad y = a\cosh\left(\frac{x}{a} + b\right)$$

an expression which is identical with our former solution of the soap film
problem.
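
The constancy of $I - y_x\,\partial I/\partial y_x$ along the catenary can be verified directly. The short sketch below evaluates this quantity at several values of $x$; the numerical values chosen for the constants $a$ and $b$ are arbitrary assumptions.

```python
import math

a, b = 1.5, -0.3   # constants of the solution (assumed values)

def y(x):
    return a * math.cosh(x / a + b)

def y_x(x):
    return math.sinh(x / a + b)

def first_integral(x):
    # I - y_x * dI/dy_x for the integrand I = y * (1 + y_x^2)^(1/2)
    root = math.sqrt(1.0 + y_x(x) ** 2)
    return y(x) * root - y(x) * y_x(x) ** 2 / root

values = [first_integral(0.25 * i) for i in range(-8, 9)]
print(values)
```

Each entry should agree with `a` to within rounding error.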
Problem. Solve Example c with the use of equation (7).


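6.2. Several Dependent Variables.—The foregoing simple considerations
may be generalized in several obvious ways. In the present section
we shall suppose that the integrand $I$ occurring in the integral to be
minimized or maximized is a function of one independent but several
dependent variables. In almost all examples relevant to this situation the
independent variable is the time, while the coordinates are dependent.
In view of this fact we shall modify our notation, using $t$ in place of the
former $x$, and $x$, $y$, $z$, etc., in place of the former $y$. We wish to find the
functions $x(t)$, $y(t)$, $z(t)$, $\cdots$ which make the integral

$$\int_{t_1}^{t_2} I(t, x, y, z, \cdots, x_t, y_t, z_t, \cdots)\,dt$$

stationary. The Euler condition is derived as before; we must require that

$$\int_{t_1}^{t_2} \delta I\,dt = 0 \tag{6-9}$$

But in this case

$$\delta I = \frac{\partial I}{\partial x}\,\delta x + \frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial z}\,\delta z + \cdots + \frac{\partial I}{\partial x_t}\,\delta x_t + \frac{\partial I}{\partial y_t}\,\delta y_t + \frac{\partial I}{\partial z_t}\,\delta z_t + \cdots$$

In computing the integral (9) we again perform partial integrations in the
second group of terms; for example:

$$\int_{t_1}^{t_2}\frac{\partial I}{\partial x_t}\,\delta x_t\,dt = \int_{t_1}^{t_2}\frac{\partial I}{\partial x_t}\,\frac{d}{dt}(\delta x)\,dt = \left[\frac{\partial I}{\partial x_t}\,\delta x\right]_{t_1}^{t_2} - \int_{t_1}^{t_2}\frac{d}{dt}\left(\frac{\partial I}{\partial x_t}\right)\delta x\,dt$$

As before, $\delta x$ vanishes at both limits. Hence (9) becomes

$$\int_{t_1}^{t_2}\left[\left(\frac{\partial I}{\partial x} - \frac{d}{dt}\frac{\partial I}{\partial x_t}\right)\delta x + \left(\frac{\partial I}{\partial y} - \frac{d}{dt}\frac{\partial I}{\partial y_t}\right)\delta y + \left(\frac{\partial I}{\partial z} - \frac{d}{dt}\frac{\partial I}{\partial z_t}\right)\delta z + \cdots\right]dt = 0$$

If $\delta x$, $\delta y$, $\delta z$ are entirely arbitrary and independent functions of $t$, each of
the parentheses occurring here must vanish separately. Hence we obtain,
in place of the one Euler equation (7), as many as there are dependent
variables:

$$\frac{\partial I}{\partial x} - \frac{d}{dt}\frac{\partial I}{\partial x_t} = 0, \qquad \frac{\partial I}{\partial y} - \frac{d}{dt}\frac{\partial I}{\partial y_t} = 0, \qquad \frac{\partial I}{\partial z} - \frac{d}{dt}\frac{\partial I}{\partial z_t} = 0, \quad \text{etc.} \tag{6-10}$$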

6.3. Example: Hamilton’s Principle.—The elementary formulation of
the laws of mechanics is Newton’s; it involves in an essential manner the
concept of force. Numerous other formulations based on different fundamental
ideas, particularly the energy concept, have been proposed throughout
the history of the subject. The most important of these is Hamilton’s
principle. It should be regarded not as a consequence of Newton’s laws
of force (although it can be shown to be consistent with them) but as a
parallel fundamental postulate of mechanics which may be useful in cases
where Newton’s laws are cumbersome in their application. The principle
takes for granted a knowledge of the kinetic energy, T, of the mechanical 
system as a function of the coordinates and their derivatives, and also of 
the potential energy, V, as a function of coordinates and possibly the time. 
From the functional form of T and V it then permits the deduction of the 
coordinates as functions of the time. 
The principle postulates that the integral

$$\int_{t_1}^{t_2} (T - V)\,dt$$

shall have a stationary value. The integrand, $T - V$, is called the Lagrangian
function. We shall consider only conservative mechanical systems,
that is, systems for which $V$ is a function of the coordinates only.

Let us first treat the motion of a simple mass point in three dimensions,
using rectangular coordinates for its description. Then

$$T = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2)$$

and

$$V = V(x, y, z)$$

so that

$$I = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - V$$

Eqs. (10) are then seen to be Newton’s laws of motion:

$$\frac{d}{dt}(m\dot{x}) = -\frac{\partial V}{\partial x}, \qquad \frac{d}{dt}(m\dot{y}) = -\frac{\partial V}{\partial y}, \qquad \frac{d}{dt}(m\dot{z}) = -\frac{\partial V}{\partial z}$$

An advantage of Hamilton’s principle becomes apparent when the
problem is such that another system of coordinates is more natural for its
solution. In that case Newton’s laws require the transformation of the
force components to the new coordinates, which is sometimes inconvenient,
while the scalars $T$ and $V$ are more easily transformed. Thus consider the
motion of a particle in a central field of force, that is, $V = V(r)$. Using
polar coordinates we have

$$T = \frac{m}{2}(\dot{r}^2 + r^2\dot{\varphi}^2), \qquad V = V(r)$$

Dependent variables are $r$ and $\varphi$. Hence Euler’s equations read

$$\frac{d}{dt}(m\dot{r}) - mr\dot{\varphi}^2 = -\frac{\partial V}{\partial r}$$

$$\frac{d}{dt}(mr^2\dot{\varphi}) = 0$$

The first of these is the well known radial equation of the problem of
planetary motion ($-\partial V/\partial r = \text{const.}/r^2$): the term $mr\dot{\varphi}^2$ represents the
centripetal force, which appears automatically in this theory. The
second equation is Kepler’s second law for it states that $r^2(d\varphi/dt) = \text{const.}$
Its meaning is obvious when it is remembered that the area swept out by
the radius vector in unit time is $\tfrac{1}{2}r^2(d\varphi/dt)$.
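
Kepler’s second law can be illustrated by a direct numerical integration of the equations of motion. The sketch below is an addition resting on several assumptions: an inverse-square attraction $V = -\text{strength}/r$, arbitrary initial data and step size, and the velocity-Verlet integration scheme, none of which appears in the text. The motion is integrated in rectangular coordinates, where $r^2\dot{\varphi} = x\dot{y} - y\dot{x}$.

```python
import math

def simulate(steps=20000, dt=0.001, mass=1.0, strength=1.0):
    """Velocity-Verlet integration of a particle in the field V = -strength / r."""
    x, y = 1.0, 0.0
    vx, vy = 0.0, 1.2          # assumed initial data giving a bound orbit

    def accel(x, y):
        r3 = (x * x + y * y) ** 1.5
        return -strength * x / (mass * r3), -strength * y / (mass * r3)

    ax, ay = accel(x, y)
    areal = []
    for _ in range(steps):
        # r^2 * dphi/dt written in rectangular components
        areal.append(x * vy - y * vx)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return areal

areal = simulate()
print(min(areal), max(areal))
```

The two printed numbers should coincide to within rounding error: the quantity $r^2\dot{\varphi}$ keeps its initial value along the whole orbit.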

Turning now to the consideration of more complicated physical systems
containing more than one mass point, we first introduce general coordinates,
$q_1, q_2, \cdots, q_n$, where $n$ is the number of degrees of freedom. $V$ will
be a function of the $q$’s, but it will not depend on the $\dot{q}$’s. The kinetic
energy, $T$, however, will be a function of the $q$’s as well as the $\dot{q}$’s (except
when Cartesian coordinates are used).

Hamilton’s principle then states that

$$\int_{t_1}^{t_2} \delta\left[T(q_1 \cdots q_n, \dot{q}_1 \cdots \dot{q}_n) - V(q_1 \cdots q_n)\right]dt = 0$$

Eqs. (10) become¹

$$\frac{\partial T}{\partial q_i} - \frac{d}{dt}\frac{\partial T}{\partial \dot{q}_i} = \frac{\partial V}{\partial q_i}, \qquad i = 1, 2, \cdots, n \tag{6-11}$$

These are the famous Lagrangian equations of motion, first derived by
Lagrange (without the use of the calculus of variations).

¹ Here $\dot{q}_i$ has been written for $dq_i/dt$.

To illustrate their applicability and also the use of generalized coordinates
we discuss one further example taken from the field of electricity.
If $q$ is the charge and $i = \dot{q}$ the current in a simple circuit which has
capacitance $C$ and self-inductance $L$, its total energy at any instant may be
shown to be

$$\tfrac{1}{2}L\dot{q}^2 + \tfrac{1}{2}\frac{q^2}{C}$$

It is clear from the foregoing remarks that the first of these two terms
may be regarded as kinetic energy $T$, the second as potential energy $V$
provided $q$ is chosen as a generalized coordinate. The intuitive meaning
of $T$ and $V$ here becomes lost, as it does in many problems of advanced
dynamics. Lagrange’s equation for the present case takes the form

$$L\ddot{q} + \frac{q}{C} = 0$$

and this will be recognized as the differential equation describing the
natural oscillations of an electrical circuit having no resistance. More
complicated examples of the application of Lagrange’s equations to electrical
and indeed even thermal phenomena are available.²
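
The oscillatory character of the circuit equation $L\ddot{q} + q/C = 0$ may be confirmed numerically. In the following sketch the component values, the initial charge, and the velocity-Verlet scheme are assumptions made for illustration; the charge is followed through one period $2\pi(LC)^{1/2}$ and the sum $T + V$ is compared at the beginning and the end.

```python
import math

L_ind = 2.0e-3    # self-inductance in henries (assumed value)
C_cap = 5.0e-6    # capacitance in farads (assumed value)
q0 = 1.0e-6       # initial charge in coulombs; the initial current is zero

omega = 1.0 / math.sqrt(L_ind * C_cap)
period = 2.0 * math.pi / omega

def integrate(t_end, steps=20000):
    """Velocity-Verlet solution of L * q'' + q / C = 0."""
    dt = t_end / steps
    q, current = q0, 0.0
    acc = -q / (L_ind * C_cap)
    for _ in range(steps):
        current += 0.5 * dt * acc
        q += dt * current
        acc = -q / (L_ind * C_cap)
        current += 0.5 * dt * acc
    return q, current

q_end, i_end = integrate(period)
energy_start = 0.5 * q0 ** 2 / C_cap                                # all "potential"
energy_end = 0.5 * L_ind * i_end ** 2 + 0.5 * q_end ** 2 / C_cap    # T + V
print(q_end / q0, energy_end / energy_start)
```

Both printed ratios should be very close to unity: after one period the charge returns to its initial value and the total energy is unchanged.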


Problem. For a simple harmonic oscillator, $V = \tfrac{1}{2}kx^2$. Use Hamilton’s principle
to obtain its equation of motion.


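² See Thomson, J. J., “Applications of Dynamics to Physics and Chemistry,”
Macmillan Co., 1888. A less extensive account may be found in Lindsay and Margenau,
“Foundations of Physics,” John Wiley and Sons, 1936, pp. 188-212.

6.4. Several Independent Variables.—Next, it is necessary to extend
the simple theory so as to permit the integrand to contain several independent
variables. The problem then is to find a function $u(x, y, z)$ such
that

$$\int_{x_1}^{x_2}\int_{y_1}^{y_2}\int_{z_1}^{z_2} I(x, y, z, u, u_x, u_y, u_z)\,dx\,dy\,dz$$

is stationary. Here we are treating $x$, $y$, $z$ as independent variables, $u$ as the
one dependent variable, and we define again: $u_x = \partial u/\partial x$, etc. As before,
we require

$$\iiint \delta I\,dx\,dy\,dz = 0 \tag{6-12}$$

Here $\delta u$ represents the increment incurred in the passage from the
extremal $u$ to some neighboring function $U$, $x$, $y$, and $z$ being held fixed.
Hence $\delta x = \delta y = \delta z = 0$. Therefore

$$\delta I = \frac{\partial I}{\partial u}\,\delta u + \frac{\partial I}{\partial u_x}\,\delta u_x + \frac{\partial I}{\partial u_y}\,\delta u_y + \frac{\partial I}{\partial u_z}\,\delta u_z$$

In evaluating an integral like $\iiint \frac{\partial I}{\partial u_x}\,\delta u_x\,dx\,dy\,dz$ we first perform the
integration with respect to $x$, obtaining

$$\int_{x_1}^{x_2}\frac{\partial I}{\partial u_x}\,\frac{\partial}{\partial x}(\delta u)\,dx = \left[\frac{\partial I}{\partial u_x}\,\delta u\right]_{x_1}^{x_2} - \int_{x_1}^{x_2}\frac{\partial}{\partial x}\left(\frac{\partial I}{\partial u_x}\right)\delta u\,dx$$

The integrated part vanishes, since $\delta u = 0$ at the limits, and (12) reads

$$\iiint\left[\frac{\partial I}{\partial u} - \frac{\partial}{\partial x}\frac{\partial I}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial I}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial I}{\partial u_z}\right]\delta u\,dx\,dy\,dz = 0$$

The corresponding Euler equation is therefore

$$\frac{\partial I}{\partial u} - \frac{\partial}{\partial x}\frac{\partial I}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial I}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial I}{\partial u_z} = 0 \tag{6-13}$$

If, in addition to $u$, there are other dependent variables $v$, $w$, etc., eq. (13)
is augmented by other equations in which $u$ is replaced by $v$, $w$, etc.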


Examples. 

a. Let us find the function $u(x, y, z)$ which has a minimum average value
of the square of its gradient in a certain region of space. Although this
requirement seems artificial at first sight, it is nevertheless of considerable
significance in electrostatic and quantum-mechanical problems. If

$$\iiint (\nabla u)^2\,dx\,dy\,dz$$

is to be stationary, $I = u_x^2 + u_y^2 + u_z^2$ (cf. Chapter 4 for the definition of
the operator $\nabla$), and (13) becomes

$$u_{xx} + u_{yy} + u_{zz} = \nabla^2 u = 0$$


This is Laplace’s equation which must be satisfied, for instance, by the 
electric potential in free space. (Cf. Chapter 7.) 
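
The minimum property may be illustrated numerically. The sketch below is an addition and, for brevity, works in two dimensions on the unit square rather than in three; the harmonic function $u = e^x\cos y$ and the variation $\eta = \sin\pi x\,\sin\pi y$, which vanishes on the boundary, are assumed examples. The integral of the squared gradient is evaluated for $u$ and for $u + \epsilon\eta$.

```python
import math

def dirichlet_integral(grad, n=200):
    """Midpoint-rule value of the integral of |grad u|^2 over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            gx, gy = grad(x, y)
            total += (gx * gx + gy * gy) * h * h
    return total

def grad_u(x, y):
    # u = exp(x) * cos(y) satisfies Laplace's equation.
    return math.exp(x) * math.cos(y), -math.exp(x) * math.sin(y)

def grad_eta(x, y):
    # eta = sin(pi x) * sin(pi y) vanishes on the boundary of the square.
    return (math.pi * math.cos(math.pi * x) * math.sin(math.pi * y),
            math.pi * math.sin(math.pi * x) * math.cos(math.pi * y))

def grad_varied(eps):
    # Gradient of the comparison function u + eps * eta.
    def grad(x, y):
        ux, uy = grad_u(x, y)
        ex, ey = grad_eta(x, y)
        return ux + eps * ex, uy + eps * ey
    return grad

base = dirichlet_integral(grad_u)
differences = [dirichlet_integral(grad_varied(eps)) - base
               for eps in (-0.1, -0.01, 0.01, 0.1)]
print(base, differences)
```

All four differences should be positive and of second order in $\epsilon$, the first-order term being absent because $u$ satisfies Laplace’s equation.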

b. Vibrating String. Let a string of length $l$ be under tension $F$. When
it executes small vibrations, it suffers the displacement $u(x)$ at right angles
to its length, which will be taken along $x$. For any distortion, $l$ changes to
$l'$, and

$$l' = \int_0^l \sqrt{1 + u_x^2}\,dx$$

If the distortion is small, the integrand may be expanded to read $1 + \tfrac{1}{2}u_x^2$,
so that

$$l' = l + \tfrac{1}{2}\int_0^l u_x^2\,dx$$

The potential energy of the entire string will then be

$$V = Fl' - Fl = \tfrac{1}{2}F\int_0^l u_x^2\,dx$$

provided the tension $F$ is not changed by the small displacements $u(x)$.
The kinetic energy is, clearly,

$$T = \tfrac{1}{2}m\int_0^l u_t^2\,dx$$

if $m$ represents the mass of the string per unit length, considered constant.
Hamilton’s principle now states:

$$\int_{t_1}^{t_2}\int_0^l \left(\tfrac{1}{2}mu_t^2 - \tfrac{1}{2}Fu_x^2\right)dx\,dt$$

shall be stationary. The two variables, $x$ and $t$, are here to be regarded
as the independent ones. The Euler equation (13) for this case is the
wave equation:

$$u_{tt} = \frac{F}{m}\,u_{xx}$$


6.5. Accessory Conditions; Lagrangian Multipliers.—Problems some- 
times arise in which it is necessary to make an integral stationary while at 
the same time one or more integrals involving the same variables are to be 
kept constant. A typical example, discussed below, is that of finding the 
closed plane curve of given perimeter and maximum area. This example, 
being one of the earliest to engage mathematical interest, has given this 
class of problems the name “isoperimetric.”

In general, the presence of accessory conditions can be dealt with by
means of “Lagrange’s method of undetermined multipliers,” as follows.
We wish to find the stationary value of

$$\int I\,d\tau$$

provided that

$$\int I_1\,d\tau = c_1, \qquad \int I_2\,d\tau = c_2, \quad \cdots, \quad \int I_n\,d\tau = c_n \tag{6-14}$$

All $I$’s contain the same variables; the limits are fixed and identical in all
integrations, and the integrations may be multiple; in the latter case $d\tau$
stands for a product of differentials. The $c$’s are understood to be constants.

We introduce a set of $n$ constant parameters, $\lambda_1, \lambda_2, \cdots, \lambda_n$, the values
of which are not at once specified. It is clear that, if $\int I\,d\tau$ is stationary,

$$\int K\,d\tau$$

where $K = I + \lambda_1 I_1 + \lambda_2 I_2 + \cdots + \lambda_n I_n$, is also stationary whatever
the values of the $\lambda$’s, because of (14). We are thus confronted with a
problem similar to the foregoing, the minimization (or maximization) of a
single integral, but with a modified integrand: $I$ must be replaced by
$K = I + \sum_i \lambda_i I_i$. If now the same steps are pursued in evaluating $\int \delta K\,d\tau$
as were outlined in sec. 6.1, we arrive at the equivalent of eq. (6) in which
$I$ is now replaced by $K$. But the passage from (6) to (7) is now obstructed
because $\delta y$ is no longer an arbitrary function: the variations must be in
accord with the relations (14). One may say that $\delta y$ has lost $n$ degrees of
freedom. But here the unspecified character of the $\lambda$’s comes to our
rescue. They are precisely $n$ in number and can be so adjusted that the
parentheses vanish.³ Hence the transition from eq. (6) to (7) is
permitted in this case as well. The extremals must satisfy Euler’s
equation

$$\frac{\partial K}{\partial y} - \frac{d}{dx}\frac{\partial K}{\partial y_x} = 0 \tag{6-15}$$

or its equivalent (7a). If there are several dependent and independent
variables, eqs. (10) and (13) take the place of (15).

In solving Euler’s equation the $\lambda$’s which are now presumably fixed but
unknown appear as constants in the extremals. They may be eliminated
formally by means of conditions (14), but their meaning can usually be
recognized more directly at some stage of the solution.


Examples. 
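a. To find the plane curve of fixed perimeter and maximum area. We
seek that $r(\varphi)$ which maximizes

$$A = \tfrac{1}{2}\int_0^{2\pi} r^2\,d\varphi$$

and has a fixed

$$\int_0^{2\pi} (r^2 + r_\varphi^2)^{1/2}\,d\varphi$$

Here

$$K = \tfrac{1}{2}r^2 + \lambda(r^2 + r_\varphi^2)^{1/2}$$

so that (15) reads:

$$r + \lambda r(r^2 + r_\varphi^2)^{-1/2} - \lambda\frac{d}{d\varphi}\left[r_\varphi(r^2 + r_\varphi^2)^{-1/2}\right] = 0$$

This leads to

$$\frac{r r_{\varphi\varphi} - 2r_\varphi^2 - r^2}{(r^2 + r_\varphi^2)^{3/2}} = \frac{1}{\lambda}$$

The left of this equation will be recognized as the curvature, $1/\rho$, of the
curve. This is to be constant, hence the curve is a circle with radius
$\rho = \lambda$.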
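
The isoperimetric property of the circle is easily illustrated by comparing areas at a common perimeter. The sketch below is an addition; the regular polygons and the 2:1 ellipse are assumed comparison figures, and the formula used for the area of a regular polygon of given perimeter is the elementary one, not one taken from the text.

```python
import math

P = 1.0  # common perimeter of all figures compared (assumed value)

def regular_polygon_area(n, perimeter):
    """Area of a regular n-gon with the given perimeter."""
    return perimeter ** 2 / (4.0 * n * math.tan(math.pi / n))

def ellipse(semi_a, semi_b, n=20000):
    """Perimeter (by the midpoint rule) and area of an ellipse."""
    h = 2.0 * math.pi / n
    perimeter = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        perimeter += math.hypot(semi_a * math.sin(t), semi_b * math.cos(t)) * h
    return perimeter, math.pi * semi_a * semi_b

circle_area = P ** 2 / (4.0 * math.pi)
polygon_areas = [regular_polygon_area(n, P) for n in (3, 4, 6, 12, 100)]

# Rescale a 2:1 ellipse so that its perimeter equals P.
p_raw, a_raw = ellipse(2.0, 1.0)
ellipse_area = a_raw * (P / p_raw) ** 2

print(circle_area, polygon_areas, ellipse_area)
```

The polygon areas should increase with the number of sides while remaining below the area of the circle, and the ellipse should likewise enclose less than the circle.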

³ For a more detailed discussion of Lagrange’s method of undetermined multipliers
see Page, L., “Introduction to Theoretical Physics,” Third Edition, D. Van Nostrand
Co., New York, 1952.

b. To prove that the sphere is the solid figure of revolution which,
for a given surface area, has maximum volume. The area is

$$A = 2\pi\int y\,ds = 2\pi\int_0^a y(1 + y_x^2)^{1/2}\,dx$$

the volume:

$$V = \pi\int_0^a y^2\,dx$$

Therefore

$$K = y^2 + \lambda y(1 + y_x^2)^{1/2}$$

since we are here permitted to drop constant factors. As $K$ does not contain
$x$ explicitly, it is convenient to use (7a) instead of (7) or (15):

$$\frac{\partial K}{\partial x} - \frac{d}{dx}\left(K - y_x\frac{\partial K}{\partial y_x}\right) = 0$$

whence

$$K - y_x\frac{\partial K}{\partial y_x} = y^2 + \lambda y(1 + y_x^2)^{1/2} - \lambda y\,y_x^2(1 + y_x^2)^{-1/2} = c$$

But clearly, $y = 0$ at $x = 0$ and at $x = a$, which can only be true if $c = 0$.
Hence

$$y^2 + \lambda y(1 + y_x^2)^{-1/2} = 0$$

or

$$y = -\lambda(1 + y_x^2)^{-1/2}$$

Solving this for $y_x$, we obtain

$$y_x = \frac{(\lambda^2 - y^2)^{1/2}}{y}$$

which on integration leads to

$$-\sqrt{\lambda^2 - y^2} = x - x_0$$

or,

$$(x - x_0)^2 + y^2 = \lambda^2$$

We note that the figure is a sphere with center on the X-axis at $x_0$ and of
radius $\lambda$.

It is possible to work this problem without the use of Lagrangian
multipliers by means of an ingenious method due to Euler. He uses in
place of the independent variable $x$ a new one, $\xi$, which measures essentially
the area of revolution formed by the arc $y(x)$ between $x = 0$ and the
variable point $x = x$:

$$\xi = \int_0^x y(1 + y_x^2)^{1/2}\,dx$$

In terms of this variable,

$$dx = \left[\left(\frac{d\xi}{y}\right)^2 - dy^2\right]^{1/2}$$

so that

$$V = \pi\int_0^b y(1 - y^2 y_\xi^2)^{1/2}\,d\xi \tag{6-16}$$

Here $b$ represents the value of $\xi$ when $x = a$, that is, the given area divided
by $2\pi$. By keeping $b$ fixed the accessory condition $A = \text{const.}$ is automatically
satisfied. This method, while very elegant, cannot be applied
generally.

The stationarity condition for (16), if written in the form (7b), yields

$$y(1 - y^2 y_\xi^2)^{1/2} + y^3 y_\xi^2(1 - y^2 y_\xi^2)^{-1/2} = y(1 - y^2 y_\xi^2)^{-1/2} = c \tag{6-17}$$

whence

$$y_\xi = \frac{1}{y}\left(1 - \frac{y^2}{c^2}\right)^{1/2}$$

After integration,

$$d - \frac{\xi}{c^2} = \left(1 - \frac{y^2}{c^2}\right)^{1/2}$$

The new constant $d$ must be 1 if the curves are to pass through $\xi = y = 0$.

To obtain the result in terms of $x$ and $y$, we substitute for $y_\xi$ in (17) the
value obtained by solving

$$y_x = y(1 + y_x^2)^{1/2}\,y_\xi$$

Eq. (17) then reads

$$y(1 + y_x^2)^{1/2} = c$$

and this is precisely the equation solved above with $-c = \lambda$.


c. Wave equation. In sec. 6.4 we have seen that Laplace’s equation is
the necessary condition that the average of the square of the gradient of a
function shall have a stationary value. If the same quantity is to be made
stationary, but with the additional requirement that $\iiint u^2\,dx\,dy\,dz$ shall have
a fixed value, another interesting equation results. In that case

$$I = \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial u}{\partial z}\right)^2, \qquad I_1 = u^2$$

The integral to be minimized is therefore $\iiint (I + \lambda I_1)\,dx\,dy\,dz$. Euler’s
equation [in the form (13)] then reads

$$\nabla^2 u - \lambda u = 0$$

which is a special form of the wave equation, namely, that describing
sinusoidal waves of a single frequency. (Cf. Chapter 7.) Such a wave
may therefore be characterized as a disturbance in which the displacement
$u$ has a fixed mean square value and at the same time a minimum square
gradient.

6.6. Schrödinger Equation.—The fundamental equation of quantum
mechanics (sec. 11.9) can be derived from a variation principle, as will
now be shown. We define an operator, known as the Hamiltonian operator,
as follows:

$$H = -k\nabla^2 + V(x, y, z)$$

The physical meaning of $k$ is seen from the relation $k = h^2/8\pi^2 m$ where $h$ is
Planck’s constant and $m$ the mass of the particle whose motion is considered;
$V$ is its potential energy. We now seek a function $\psi$, possibly
complex, which satisfies the following two conditions:

$$\iiint \psi^*(H\psi)\,dx\,dy\,dz \quad \text{shall be stationary;} \tag{6-18a}$$

$$\iiint \psi^*\psi\,dx\,dy\,dz = 1 \tag{6-18b}$$

The integrations are taken over fixed domains of $x$, $y$, and $z$. It will be
supposed, furthermore, that the permissible functions $\psi$ and $\psi^*$ either
vanish sufficiently strongly at the boundaries of the volume of integration,
or take on the same values and derivatives at corresponding points on
opposite boundaries.

When this is true, the following transformation may be made:

$$\int \psi^*\frac{\partial^2\psi}{\partial x^2}\,dx = \left[\psi^*\frac{\partial\psi}{\partial x}\right] - \int \frac{\partial\psi^*}{\partial x}\frac{\partial\psi}{\partial x}\,dx$$

The integrated part vanishes. As a consequence

$$\iiint \psi^*\nabla^2\psi\,dx\,dy\,dz = -\iiint (\nabla\psi^*)\cdot(\nabla\psi)\,dx\,dy\,dz$$

and condition (18a) may be modified to read:

$$\delta\iiint \left[k(\nabla\psi^*)\cdot(\nabla\psi) + V(x, y, z)\,\psi^*\psi\right]dx\,dy\,dz = 0$$

The function $K$ which appears in Euler’s equation [(15) but generalized in
accordance with (13) to take care of the fact that there are now three
independent variables] is

$$K = k(\psi_x^*\psi_x + \psi_y^*\psi_y + \psi_z^*\psi_z) + V\psi^*\psi - \lambda\psi^*\psi$$

Euler’s equations are ($\psi^*$ and $\psi$ are both dependent variables!)

$$\frac{\partial K}{\partial\psi} - \frac{\partial}{\partial x}\frac{\partial K}{\partial\psi_x} - \frac{\partial}{\partial y}\frac{\partial K}{\partial\psi_y} - \frac{\partial}{\partial z}\frac{\partial K}{\partial\psi_z} = 0$$

$$\frac{\partial K}{\partial\psi^*} - \frac{\partial}{\partial x}\frac{\partial K}{\partial\psi_x^*} - \frac{\partial}{\partial y}\frac{\partial K}{\partial\psi_y^*} - \frac{\partial}{\partial z}\frac{\partial K}{\partial\psi_z^*} = 0$$

They reduce to

$$-k(\psi_{xx} + \psi_{yy} + \psi_{zz}) + V\psi = \lambda\psi \tag{6-19}$$

and a similar equation for $\psi^*$. To identify the constant $\lambda$, we note that
eq. (19) may be written

$$H\psi = \lambda\psi$$

If we multiply this equation by $\psi^*$ and integrate over $x$, $y$, $z$ the left side
becomes the stationary integral (18a), which will be denoted by $E$. The
right is $\lambda$ in view of (18b). Hence $\lambda = E$. With this substitution for $\lambda$,
eq. (19) is Schrödinger’s equation.

This result is worth summarizing. Schrödinger’s equation serves the
purpose of selecting the extremals $\psi$ which make $\iiint \psi^*(H\psi)\,dx\,dy\,dz$
stationary, provided $\iiint \psi^*\psi\,dx\,dy\,dz$ is held constant. If the latter
constant is unity, then $\iiint \psi^*(H\psi)\,dx\,dy\,dz$ is the energy which appears
in the Schrödinger problem. Further inspection shows the energy to be a
minimum rather than a maximum in most cases of physical interest. Upon
these results is based one of the most powerful methods of obtaining
approximate solutions of eq. (19). (Cf. sec. 11.18.)
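
The way in which the minimum property yields approximate solutions may be sketched on a simple case. The example below is an addition, not the method of sec. 11.18 itself: it assumes one dimension, the potential $V = \tfrac{1}{2}\,\text{spring}\cdot x^2$, units in which $k = \tfrac{1}{2}$, and a trial function $\psi = e^{-\alpha x^2}$ with a single adjustable parameter $\alpha$. Instead of normalizing $\psi$, the code forms the ratio of the integrals (18a) and (18b), using the integrand $k\psi_x^2 + V\psi^2$ obtained after the partial integration.

```python
import math

k = 0.5        # the constant k = h^2 / (8 pi^2 m), in assumed units
spring = 1.0   # force constant in V = (1/2) * spring * x^2 (assumed example)

def energy(alpha, half_width=10.0, n=4000):
    """Ratio of the integrals (18a) and (18b) for psi = exp(-alpha x^2)."""
    h = 2.0 * half_width / n
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        psi = math.exp(-alpha * x * x)
        dpsi = -2.0 * alpha * x * psi
        potential = 0.5 * spring * x * x
        # k * psi_x^2 + V * psi^2, the integrand after the partial integration
        numerator += (k * dpsi * dpsi + potential * psi * psi) * h
        denominator += psi * psi * h
    return numerator / denominator

alphas = [0.05 * j for j in range(2, 41)]      # trial values 0.10, 0.15, ..., 2.00
energies = [energy(al) for al in alphas]
best = min(range(len(alphas)), key=lambda j: energies[j])
print(alphas[best], energies[best])
```

For this trial family the ratio equals $k\alpha + \text{spring}/8\alpha$, so the scan should select the value of $\alpha$ at which this expression is least, with energy $(k\cdot\text{spring}/2)^{1/2}$.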

6.7. Concluding Remarks.—In concluding this chapter, we note a few
possible generalizations of the theory given here. In the first place, one
may remove the restriction that $\delta y = 0$ at the limits of integration. This
means, with reference to Fig. 1, that the curves $y(x)$ and $Y(x)$ do not have
the same termini. The integrated term which appears in the partial integration
leading to eq. (6) will then no longer vanish, and there arise three
conditions in place of eq. (7):

$$\frac{\partial I}{\partial y} - \frac{d}{dx}\frac{\partial I}{\partial y_x} = 0; \qquad \left.\frac{\partial I}{\partial y_x}\right|_{x_1} = 0; \qquad \left.\frac{\partial I}{\partial y_x}\right|_{x_2} = 0$$

The second and third of these then serve to fix the arbitrary constants in
the solution of Euler’s equation.

A further generalization is needed when the limits $x_1$ and $x_2$ themselves
are no longer fixed. Whenever this happens, introduction of a new
parameter, in terms of which both $x$ and $y$ may be expressed, reduces the
problem to the forms here discussed.* The Principle of Least Action
involves a variation problem with variable limits. Since Hamilton’s
principle is in general more powerful, the former, in spite of its historical
interest, will here be omitted.

When the integrand $I$ involves higher derivatives than the first, no
great complications arise. The Euler equation then contains additional
terms. The point where our simple treatment has been most deficient is
in its omission of all considerations establishing the actual existence of
maximizing and minimizing curves. It will be recalled that Euler’s
equations are merely necessary conditions. They furnish no assurance
whatever that the curves sought are indeed present among the extremals.
For these more mathematical questions we refer the reader to the treatises
by Bolza, Bliss, and Kneser.

CHAPTER 7
PARTIAL DIFFERENTIAL EQUATIONS OF CLASSICAL PHYSICS


7.1. General Considerations.—The general theory of partial differential
equations is well beyond the scope of this book and will not be developed
in a systematic way.¹ Attention will here be limited to a small number of
partial differential equations which are of frequent occurrence, almost all
of which may be resolved by a powerful method known as the separation
of variables. Before we proceed to consider specific examples, however, a
few remarks about the meaning and variety of the solutions are in order.

The simplest type of an ordinary differential equation, that of the first
order, has a general solution which contains one arbitrary constant;
geometrically it may be interpreted as a set of plane curves labeled by different
values of the arbitrary constant. In particular, if the equation is linear,
there is but one curve passing through a given point, and this is uniquely
specified when the value of $y$ for some value of $x$ is prescribed.

The simplest type of partial differential equation is one with two
independent variables ($x$ and $y$), and the dependent variable ($z$), which is
linear and of the first order. Its solutions represent, geometrically, a set
of surfaces constructed over the X-Y plane. The question may be asked:
Is one such surface uniquely determined by requiring that it include a given
point? If this were true, the manifold of solutions of the differential
equation, $z(x, y)$, would reduce to a single surface when it is specified that
the solution shall contain that point.

This, however, is not the case. For consider the simple equation
$\partial z/\partial x + \partial z/\partial y = 0$. It is clear that any function of the form $z = f(x - y)$
will satisfy it. This function is not uniquely determined by fixing one
point of it. The origin, for instance, is contained in all surfaces
$z = c(x - y)$, and yet every different value of $c$ defines a different surface.

Neither does a prescribed curve fix a surface. For let it be required
that the solution $z$ of the partial equation above shall pass through the line
$x = y$ in the X-Y plane. This is certainly accomplished by taking
$z = (x - y)^n$, but there is an infinite number of such surfaces depending
on the parameter $n$. It is clear from these elementary considerations that
in dealing with the solutions of partial differential equations we are confronted
with a variety of functions which far transcends the degree of
complexity encountered in connection with ordinary differential equations.
In fact one must not be surprised to find that the complete geometric
specification of a solution of a partial equation even of the simplest type
usually requires the fixation of an infinite number of parameters.

¹ A more complete discussion is found in the references at the end of this chapter.

7.2. Laplace’s Equation.—An equation which arises in almost all
branches of analysis is Laplace’s:

$$\nabla^2 V = 0 \tag{7-1}$$

Its intuitive meaning was discussed in the chapter on the calculus of
variations (sec. 6.4), where eq. (7-1) was shown to be equivalent to the postulate
that $V$ shall have the least mean gradient. The function $V$ satisfying (1)
may be said to be the “smoothest” of all functions. This is obvious when
Laplace’s equation is solved in one dimension, for then it simply reads:
$d^2V/dx^2 = 0$ and has as its solutions all straight lines.

To indicate briefly the range of application of eq. (1) we state three
instances in which it occurs:

a. A fundamental theorem of function theory states:
Let $z = x + iy$; then the function $f(z)$ takes the form

$$f(z) = u(x, y) + iv(x, y)$$

wherein $u$, $v$, $x$, $y$ are all real; if and only if the functions $u$ and $v$ satisfy:

$$\nabla^2 u = 0, \qquad \nabla^2 v = 0$$

b. In sec. 4.12 it was shown that the velocity $\mathbf{v}$ of an indestructible
fluid, as a function of space coordinates and the time, must be a solution
of the equation of continuity, which reads

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0$$

If the fluid is incompressible, its density $\rho$ is constant, and the equation
reads

$$\nabla\cdot\mathbf{v} = 0$$

If, furthermore, the motion is irrotational, the velocity vector is the gradient
of a scalar function $V$, known as the velocity potential: $\mathbf{v} = -\nabla V$,
and the equation of continuity thus becomes equivalent to Laplace’s:
$\nabla^2 V = 0$.

c. The electrostatic potential in a region of space not occupied by
charges satisfies Laplace’s equation.

Before discussing a partial differential equation of this general form one
must realize, of course, that its solutions for different numbers of dimensions
(independent variables) are quite different; moreover, that the form of
the solution even for the same number of dimensions will be different in
different systems of coordinates.


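7.3. Laplace’s Equation in Two Dimensions.—a. Rectangular Coordinates.
The equation reads:

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = 0 \tag{7-2}$$

A method, not of universal applicability but suitable for this particular
problem, involves the transformation to a new set of independent variables:

$$\xi = x + iy, \qquad \eta = x - iy$$

In terms of these,

$$\frac{\partial^2}{\partial x^2} = \frac{\partial^2}{\partial\xi^2} + 2\frac{\partial^2}{\partial\xi\,\partial\eta} + \frac{\partial^2}{\partial\eta^2}, \qquad \frac{\partial^2}{\partial y^2} = -\frac{\partial^2}{\partial\xi^2} + 2\frac{\partial^2}{\partial\xi\,\partial\eta} - \frac{\partial^2}{\partial\eta^2}$$

so that

$$\nabla^2 V = 4\frac{\partial^2 V}{\partial\xi\,\partial\eta} = 0$$

Clearly, this equation admits both $V = f(\xi)$ and $V = f(\eta)$ as solutions,
hence

$$V = f_1(\xi) + f_2(\eta) = f_1(x + iy) + f_2(x - iy)$$

where $f_1$ and $f_2$ are any two functions which are twice differentiable. The
reader will hardly fail to see the connection between this result and the
statement above concerning the functions of a complex variable.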
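
That a function of $x + iy$ added to a function of $x - iy$ satisfies Laplace’s equation can be checked with a finite-difference approximation to $V_{xx} + V_{yy}$. In the sketch below, which is an addition, the choices $f_1(w) = w^3$ and $f_2(w) = e^w$, the sample points, and the step $h$ are arbitrary assumptions; the function $x^2 + y^2$ is included as a contrast, since it is not of the stated form.

```python
import cmath

def laplacian(func, x, y, h=1.0e-3):
    """Five-point finite-difference approximation to V_xx + V_yy."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / (h * h)

def f1(w):
    return w ** 3

def f2(w):
    return cmath.exp(w)

def V(x, y):
    # V = f1(x + iy) + f2(x - iy); f1 and f2 are arbitrary example choices.
    return f1(complex(x, y)) + f2(complex(x, -y))

def not_harmonic(x, y):
    # |x + iy|^2 = x^2 + y^2 is not of the above form; its Laplacian is 4.
    return abs(complex(x, y)) ** 2

points = [(0.3, -0.7), (1.1, 0.4), (-0.5, 0.9)]
for point in points:
    print(point, abs(laplacian(V, *point)), laplacian(not_harmonic, *point))
```

The first printed residual should be small at every point, limited only by the finite-difference error, while the contrast function gives a value near 4.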

For many problems another form of solution, obtainable by the method 
of separation of variables, is more satisfactory. Let us make the assump- 
tion, justifiable by its success, that V may be written in the form 


Y = X(z)- Y) (7-3) 


where X and Y are functions of only one independent variable, x and y, 
respectively. When (3) is substituted in (2) there results, after division 
by V, ; 

X / y” 

x tyr (7-4) 


an equation in which primes denote differentiation of a function with 
respect to its own variable. If (4) is to have a solution at all, then each 
term on the left must separately be equal to a constant; for a change in z 
would not alter the value of Y’’/Y, and a change in y would not affect 
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X''/X. One may therefore conclude: 


x” y” 

Z = k e = k? 7-5 

54 F (7-5) 
where the constant parameter $k^2$, written in this form for convenience, may have any value, real or complex. These are two ordinary equations which may easily be solved by the methods of Chapter 2. Eq. (5) leads at once to

$$X = c_1 e^{\pm kx}, \qquad Y = c_2 e^{\pm iky}$$

Hence a solution of (2), characterized by a given value of the parameter $k$, will be

$$V_k = c_k e^{\pm k(x \pm iy)} \tag{7-6}$$


Since (2) is a linear equation, a sum of expressions like (6) is also a solution. 
Hence a more general solution is 


$$V = \sum_k c_k e^{\pm k(x \pm iy)}$$

or even

$$V = \int c(k)\, e^{\pm k(x \pm iy)}\, dk \tag{7-6a}$$


For the value $k = 0$ the result is of a more special form. Eq. (5) then leads to

$$X = a_1 x + a_2, \qquad Y = b_1 y + b_2$$

so that

$$V = axy + cx + dy + e \tag{7-7}$$


Which of the solutions, (6), (6a), or (7), is to be chosen depends entirely on 
the nature of the problem at hand. (Cf. examples.) 


b. Polar Coordinates. Laplace’s equation reads: 
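
$$\frac{\partial^2 V}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial V}{\partial \rho} + \frac{1}{\rho^2}\frac{\partial^2 V}{\partial \varphi^2} = 0 \tag{7-8}$$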


Using again the method of separation of variables, we put 

$$V = P(\rho)\,\Phi(\varphi)$$

When this is substituted into (8) there results, after multiplication by $\rho^2/V$,

$$\rho^2\frac{P''}{P} + \rho\frac{P'}{P} + \frac{\Phi''}{\Phi} = 0 \tag{7-9}$$

Here the first two terms are independent of $\varphi$, the third is independent of $\rho$. Hence we may write

$$\rho^2\frac{P''}{P} + \rho\frac{P'}{P} = k^2, \qquad \frac{\Phi''}{\Phi} = -k^2$$

The solution of the first equation is at once seen to be $P = \rho^{\pm k}$, that of the second, $\Phi = e^{\pm ik\varphi}$. Hence

$$V_k = c_k\,\rho^{\pm k} e^{\pm ik\varphi} \tag{7-10}$$


or, more generally, 

$$V = \sum_k c_k\,\rho^{\pm k} e^{\pm ik\varphi} \tag{7-10a}$$

For $k = 0$, (9) becomes

$$P'' + \frac{P'}{\rho} = 0, \qquad \Phi'' = 0$$

When integrated once, the first of these yields $P' = c\rho^{-1}$; after another integration $P = a_1 \ln\rho + a_2$. On the other hand, $\Phi'' = 0$ leads to $\Phi = b_1\varphi + b_2$. Hence a particular solution is

$$V = (a_1 \ln\rho + a_2)(b_1\varphi + b_2) \tag{7-11}$$


Again, further information must be available before a special one of these 
results can be selected as a suitable solution of a given problem. 
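
As a hedged numerical check of the polar solutions, the sketch below applies a finite-difference version of the polar form of Laplace's equation to one member of the family (7-10) and to the special k = 0 solution (7-11). The value of k, the constants in the logarithmic solution, the sample point, and all function names are illustrative assumptions, not taken from the text.

```python
import math

def polar_laplacian(V, rho, phi, h=1e-3):
    """Finite-difference estimate of V_rr + V_r / rho + V_phiphi / rho^2, eq. (7-8)."""
    d2_rho = (V(rho + h, phi) - 2 * V(rho, phi) + V(rho - h, phi)) / h**2
    d1_rho = (V(rho + h, phi) - V(rho - h, phi)) / (2 * h)
    d2_phi = (V(rho, phi + h) - 2 * V(rho, phi) + V(rho, phi - h)) / h**2
    return d2_rho + d1_rho / rho + d2_phi / rho**2

k = 2.5  # illustrative real value; the text allows any real or complex k

def power_solution(rho, phi):
    # Real part of rho^k e^{i k phi}, one member of the family (7-10).
    return rho**k * math.cos(k * phi)

def log_solution(rho, phi):
    # Special solution (7-11) with arbitrary constants a1=2, a2=1, b1=3, b2=-1.
    return (2 * math.log(rho) + 1) * (3 * phi - 1)

residual_power = abs(polar_laplacian(power_solution, 1.3, 0.6))
residual_log = abs(polar_laplacian(log_solution, 1.3, 0.6))
print(residual_power, residual_log)
```

Both residuals should be small compared with the individual terms of the equation, which are of order unity at the chosen point.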


7.4. Laplace’s Equation in Three Dimensions. ?—a. Rectangular Coordi- 
nates. An application of the method of separation of variables to 

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0$$

follows precisely along the lines of sec. 7.3a. We put $V = X(x)\,Y(y)\,Z(z)$ and obtain

$$\frac{X''}{X} + \frac{Y''}{Y} + \frac{Z''}{Z} = 0$$

Each of these terms must separately equal a constant, and the sum of these constants (which we write as $k_1^2$, $k_2^2$, $k_3^2$) must vanish. Thus

$$V_{k_1 k_2 k_3} = e^{k_1 x + k_2 y + k_3 z}, \qquad k_1^2 + k_2^2 + k_3^2 = 0 \tag{7-12}$$

If $k_1$, $k_2$, or $k_3$ is zero, the corresponding factor in (12) must be replaced by $a_1 x + a_2$, etc. A more general solution would be

$$V = \sum_{k_1 k_2 k_3} c_{k_1 k_2 k_3}\, e^{k_1 x + k_2 y + k_3 z} \tag{7-12a}$$

In this connection it is sometimes convenient to regard $k_1$, $k_2$, $k_3$ formally as the components of a vector $\mathbf{k}$. Eq. (12) may then be written

$$V_{\mathbf{k}} = c(\mathbf{k})\, e^{\mathbf{k}\cdot\mathbf{r}}, \qquad |\mathbf{k}| = 0$$

2 A solution of Laplace’s equation in three dimensions is often called an “harmonic” function.


b. Cylindrical Coordinates. In accordance with the results of Chap- 
ter 5, 
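
$$\frac{\partial^2 V}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial V}{\partial \rho} + \frac{1}{\rho^2}\frac{\partial^2 V}{\partial \varphi^2} + \frac{\partial^2 V}{\partial z^2} = 0$$

Put

$$V = P(\rho)\,Z(z)\,\Phi(\varphi)$$

substitute, and divide by $V$. The result is

$$\frac{P''}{P} + \frac{1}{\rho}\frac{P'}{P} + \frac{1}{\rho^2}\frac{\Phi''}{\Phi} + \frac{Z''}{Z} = 0$$

Clearly, the last term on the left must be constant; let us put it equal to $-k^2$. Then

$$Z = c\,e^{\pm ikz} \tag{7-13}$$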


The remaining equation, 


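
$$\frac{P''}{P} + \frac{1}{\rho}\frac{P'}{P} + \frac{1}{\rho^2}\frac{\Phi''}{\Phi} - k^2 = 0$$

when multiplied by $\rho^2$, separates again into two equations:

$$\frac{\Phi''}{\Phi} = -l^2, \qquad \rho^2 P'' + \rho P' - (k^2\rho^2 + l^2)P = 0$$

The first has the solution

$$\Phi = c_2\,e^{\pm il\varphi}$$

the second turns into Bessel’s differential equation (2-57) when the substitution $ik\rho = x$ is made, for it then reads:

$$\frac{d^2 P}{dx^2} + \frac{1}{x}\frac{dP}{dx} + \left(1 - \frac{l^2}{x^2}\right)P = 0$$

The solution of this equation was discussed in sec. 2.14. It will here be denoted by $Z_l$. Collecting these results we have

$$V_{kl} = c_{kl}\,Z_l(ik\rho)\,e^{\pm i(kz \pm l\varphi)} \tag{7-14}$$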


When $l = 0$, $\Phi = a_1\varphi + a_2$; hence we obtain as another solution of lesser generality than (14) the expression:

$$V_{k0} = c_{k0}\,Z_0(ik\rho)\,e^{\pm ikz}(a_1\varphi + a_2) \tag{7-14a}$$

When $k = 0$, $Z = (b_1 z + b_2)$ instead of the function (13). The equation for $P$ takes the form

$$\rho^2 P'' + \rho P' - l^2 P = 0$$

which was already encountered in sec. 7.3b. It has the solution $\rho^{\pm l}$. Hence

$$V_{0l} = \rho^{\pm l}(b_1 z + b_2)\,e^{\pm il\varphi} \tag{7-14b}$$

Finally, when both $l$ and $k$ are zero, the solution may be seen to take the form

$$V_{00} = (a_1 \ln\rho + a_2)(b_1 z + b_2)(c_1\varphi + c_2) \tag{7-14c}$$

The most general function satisfying Laplace’s equation is a superposition of solutions (14)-(14c).

c. Polar (Spherical) Coordinates. As was shown in the chapter on 
coordinate systems (sec. 5.4), the equation V?V = 0, when transformed to 
polar coordinates, reads: 


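$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial V}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial V}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 V}{\partial\varphi^2} = 0 \tag{7-15}$$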


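Multiplication by $r^2\sin^2\theta$ will isolate the term $\partial^2 V/\partial\varphi^2$ as the only one depending on $\varphi$ from the remainder of the equation. If, therefore, we put it (after division by $V$, the quantity $\Phi''/\Phi$) equal to $-m^2$ so that

$$\Phi = e^{\pm im\varphi} \tag{7-16}$$

($V$ being written as $R(r)\cdot\Theta(\theta)\cdot\Phi(\varphi)$), then eq. (15) takes the form

$$\frac{\sin^2\theta}{R}\frac{d}{dr}\left(r^2 R'\right) + \frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left(\sin\theta\,\Theta'\right) - m^2 = 0$$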


When this is divided through by $\sin^2\theta$ the terms involving $r$ are cleanly separated from those involving $\theta$. Hence we obtain

$$\frac{1}{\Theta}\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\Theta'\right) - \frac{m^2}{\sin^2\theta} + c = 0 \tag{7-17}$$

$$\frac{1}{R}\frac{d}{dr}\left(r^2 R'\right) - c = 0 \tag{7-18}$$

where $c$ denotes the same constant in both equations. It will prove convenient to write this constant in the form $c = l(l + 1)$. Let us now make the substitution $\cos\theta = x$ in eq. (17), obtaining (after multiplication by $\Theta$)

$$(1 - x^2)\frac{d^2\Theta}{dx^2} - 2x\frac{d\Theta}{dx} + \left[l(l+1) - \frac{m^2}{1 - x^2}\right]\Theta = 0 \tag{7-19}$$

This, however, is none other than the differential equation (cf. eq. 2-41) for associated spherical harmonics discussed in sec. 2.11. Special solutions were studied in sec. 3.6. They were written in the form

$$\Theta = P_l^m(x)$$


It must here be noted that these functions do not represent the general 
solution of eq. (19), but a particular one having the property of being finite 
for all values of x between —1 and +1, including these limits. In most 
physical problems this is a condition naturally to be imposed on the solu- 
tion of Laplace’s equation; there are cases, however, in which a more 
general solution of (19) must be chosen. It was also found in sec. 2.11 
that the constant $l$ must, for the sake of finiteness, be a positive integer. We shall restrict the present consideration to problems in which these conditions hold, and assume

$$\Theta = P_l^m(\cos\theta) \tag{7-20}$$

This expression has no meaning unless $m$, also, is a positive integer. Again, the nature of most physical problems imposes this requirement. For if $V$ represents the distribution of any physical quantity in space, it must obviously be periodic in $\varphi$ and have a period of $2\pi$, since otherwise $V(\varphi)$ and $V(2\pi + \varphi)$ would have different values although $\varphi$ and $2\pi + \varphi$ denote the same angle. But the function (16) does not possess this periodicity unless $m$ is an integer.

The function R is now easily obtained by solving (18) which reads on 
expansion: 


$$r^2 R'' + 2rR' - l(l+1)R = 0$$

Its solution is obviously of the form $R = r^{\alpha}$, and on substitution we find

$$\alpha(\alpha - 1) + 2\alpha - l(l+1) = 0$$

so that $\alpha$ is either $l$ or $-(l+1)$. Hence

$$R = a_1 r^l + a_2 r^{-(l+1)} \tag{7-21}$$

In view of (16), (20) and (21) we conclude that a solution of Laplace’s equation in polar coordinates has the form

$$V_{lm} = \left(a_1 r^l + a_2 r^{-(l+1)}\right) P_l^m(\cos\theta)\,e^{\pm im\varphi} \tag{7-22}$$

and the general solution will be a superposition of any number of such functions.

Other systems of coordinates in which the equation V?V = 0 can be 
solved by the method of separation of variables are listed in Chapter 5. 
It is felt, however, that the foregoing special cases illustrate the procedure. 
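
The spherical result can be spot-checked for the axially symmetric case m = 0. The sketch below is an illustration under stated assumptions: the Legendre polynomials are generated by the standard three-term recurrence rather than by the formulas of Chapter 3, the values of l, a1, a2 and the sample point are arbitrary, and the helper names are hypothetical.

```python
import math

def legendre(l, x):
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def spherical_laplacian(V, r, theta, h=1e-3):
    """Axially symmetric part of eq. (7-15), estimated by central differences."""
    def flux_r(rr):   # r^2 dV/dr
        return rr**2 * (V(rr + h / 2, theta) - V(rr - h / 2, theta)) / h
    def flux_t(tt):   # sin(theta) dV/dtheta
        return math.sin(tt) * (V(r, tt + h / 2) - V(r, tt - h / 2)) / h
    radial = (flux_r(r + h / 2) - flux_r(r - h / 2)) / (h * r**2)
    angular = (flux_t(theta + h / 2) - flux_t(theta - h / 2)) / (h * r**2 * math.sin(theta))
    return radial + angular

l = 3
a1, a2 = 0.8, -1.7  # arbitrary constants in (7-21)

def V_l0(r, theta):
    """Solution (7-22) with m = 0."""
    return (a1 * r**l + a2 * r**(-(l + 1))) * legendre(l, math.cos(theta))

# Both exponents alpha = l and alpha = -(l+1) must satisfy the indicial equation.
indicial = [alpha * (alpha - 1) + 2 * alpha - l * (l + 1) for alpha in (l, -(l + 1))]
residual = abs(spherical_laplacian(V_l0, 1.4, 0.9))
print(indicial, residual)
```

The indicial values vanish exactly, while the Laplacian residual is limited only by the finite-difference step.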




EXAMPLES OF SOLUTIONS OF LAPLACE’S EQUATION 


7.5. Sphere Moving through an Incompressible Fluid without Vortex 
Formation.—Since the motion of the liquid is irrotational, its velocity, v, 
at every point is the gradient of a scalar potential, V, which satisfies 
Laplace’s equation. Thus 


$$\mathbf{v} = -\nabla V, \qquad \text{and} \qquad \nabla^2 V = 0$$


Which of all the solutions derived above is to be chosen, depends entirely 
on the boundary conditions of the problem. These are, clearly: 

a. The radial velocity of the fluid at the surface of the sphere of radius $r_0$ shall be equal to the velocity of the sphere times the cosine of the angle which $\mathbf{r}$ makes with the direction of motion of the sphere. Taking the latter as the polar axis, we have

$$-\left(\frac{\partial V}{\partial r}\right)_{r=r_0} = v_0\cos\theta \tag{a}$$

b. The distant portions of the liquid are not affected by the motion of the sphere. Hence

$$\left(\frac{\partial V}{\partial r}\right)_{r=\infty} = 0 \tag{b}$$

The form of these conditions at once prescribes the use of polar coordinates. The solution is, therefore, of the form (22). To satisfy (a) we must put the angular part of this expression equal to $\cos\theta$; there is no dependence on $\varphi$ at all. The only possible value of $m$ which produces freedom from $\varphi$ is zero, and of all the functions $P_l^0(\cos\theta)$, only $P_1^0(\cos\theta)$ is equal to $\cos\theta$. Hence $l = 1$. Condition (a) now states:

$$-\left[\frac{\partial}{\partial r}\left(a_1 r + a_2 r^{-2}\right)\right]_{r=r_0}\cos\theta = v_0\cos\theta$$

whence

$$-a_1 + 2a_2 r_0^{-3} = v_0$$

But condition (b) cannot be satisfied unless $a_1 = 0$. Therefore $2a_2 r_0^{-3} = v_0$. Eq. (22) has thus been reduced to

$$V = \frac{v_0 r_0^3}{2r^2}\cos\theta$$

This represents the velocity potential for the case in question.
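
A short numerical sketch can confirm that this potential meets both boundary conditions. The values of v0, r0 and the polar angle, and the function names, are assumptions chosen for illustration; the radial velocity is obtained as minus the radial derivative of V by a central difference.

```python
import math

def velocity_potential(r, theta, v0=2.0, r0=0.5):
    """V = v0 r0^3 cos(theta) / (2 r^2), the result of sec. 7.5."""
    return v0 * r0**3 * math.cos(theta) / (2 * r**2)

def radial_velocity(r, theta, v0=2.0, r0=0.5, h=1e-5):
    """v_r = -dV/dr, estimated by a central difference."""
    return -(velocity_potential(r + h, theta, v0, r0)
             - velocity_potential(r - h, theta, v0, r0)) / (2 * h)

v0, r0, theta = 2.0, 0.5, 0.7
# Condition (a): at the surface the fluid moves with the sphere.
surface_error = abs(radial_velocity(r0, theta, v0, r0) - v0 * math.cos(theta))
# Condition (b): far away the fluid is essentially at rest.
far_velocity = abs(radial_velocity(1e3 * r0, theta, v0, r0))
print(surface_error, far_velocity)
```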
7.6. Simple Electrostatic Potentials—As a matter of illustration, we 
consider the simplest electrostatic potentials from the point of view of 
Laplace’s equation: that due to a charged conducting sphere, a uniformly charged cylinder (wire) of infinite length, and a uniformly charged infinite plane.

a. The boundary condition in the first case is obviously $V(r_0,\theta,\varphi) = V_0$, a constant, provided we write $r_0$ for the radius of the sphere. Since spherical polar coordinates are used in describing this condition, the general solution of Laplace’s equation must be taken in the form (22). The condition also prescribes that $V$ shall be independent of $\theta$ and $\varphi$, for otherwise $V(r_0,\theta,\varphi)$ could not be constant. Hence $m$ and $l$ must both be zero. We conclude, therefore, that $V = a_1 + a_2/r$, and since $a_1 + a_2/r_0 = V_0$, we find on eliminating $a_2$:

$$V = a_1\left(1 - \frac{r_0}{r}\right) + V_0\frac{r_0}{r}$$

If we require in addition that $V$ be zero at $r = \infty$ (which would be true if the potential were produced entirely by the charged sphere) the constant $a_1 = 0$.

b. The boundary condition in the case of the cylinder reads $V(\rho_0,\varphi,z) = V_0$, $\rho_0$ denoting the radius of the cylinder. Solution (14) is now relevant; but the observation that $V$ is independent of $\varphi$ and $z$ leads at once to (14c) with $b_1 = c_1 = 0$. Hence we have $V = a_1 \ln\rho + a_2$. On eliminating $a_2$ by means of the boundary condition we find

$$V = V_0 + a_1 \ln\frac{\rho}{\rho_0}$$

The constant $a_1$ can be determined only when further facts, e.g., the charge density on the cylinder, are known. (In fact, $a_1 = -2\lambda$ where $\lambda$ is linear charge density.)

c. In the case of the charged plane we require $V(x,y,0) = V_0$, supposing $z = 0$ to define the plane. This leads at once to a solution of the form (12), but with $k_1 = k_2 = 0$; for otherwise $V$ would depend on $x$ and $y$. Since then $k_3$ must also vanish,

$$V = (a_1 x + a_2)(b_1 y + b_2)(c_1 z + c_2)$$

Again, to satisfy the boundary condition, $a_1 = b_1 = 0$, so that

$$V = c_1 z + V_0$$

The constant $c_1$ can be eliminated when the charge density on the plane is known. ($c_1 = -4\pi\sigma$ if $\sigma$ is surface density of charge.)

All these results could have been obtained much more simply by applying Gauss’ law of electrostatics; our purpose here was to exhibit them as solutions of Laplace’s equation.

Problem. To find the potential produced when a conducting sphere is placed in an originally uniform field of strength $E_0$, extending along the Z-axis. Use as boundary conditions:

$V = 0$ at $r = r_0$ (radius of sphere)
$V = -E_0 z = -E_0 r\cos\theta$ at $r \to \infty$

Ans. $V = -E_0 r\cos\theta\left[1 - \left(\dfrac{r_0}{r}\right)^3\right]$
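
The quoted answer is of the form (22) with l = 1 and m = 0, and its two boundary conditions can be checked directly. In this minimal sketch the field strength, the sphere radius, the polar angle and the function name `potential` are illustrative assumptions.

```python
import math

def potential(r, theta, E0=1.5, r0=0.8):
    """V = -E0 r cos(theta) [1 - (r0 / r)^3], the answer quoted for the problem."""
    return -E0 * r * math.cos(theta) * (1 - (r0 / r) ** 3)

E0, r0, theta = 1.5, 0.8, 1.1
# On the sphere the potential should vanish.
on_sphere = potential(r0, theta, E0, r0)
# Far away it should approach the uniform-field potential -E0 r cos(theta).
r_far = 1e4 * r0
relative_far = potential(r_far, theta, E0, r0) / (-E0 * r_far * math.cos(theta))
print(on_sphere, relative_far)
```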


[Fig. 7-1. Labeled field point: P(r, θ).]


7.7. Conducting Sphere in the Field of a Point Charge.—We wish to 
find the potential at P due to a point charge +q situated at z = a on the 
Z-axis when a conducting earthed sphere, distorting the field, is placed 
with its center at O(cf. Fig. 1). Clearly, 


$$V(r) = \frac{q}{s} + U$$

if $U$ is the potential due to the induced charge on the sphere. Like the first term $q/s$, $U$ must be a solution of Laplace’s equation and may conveniently be written in the form (22). Here, however, it becomes necessary to retain full generality and use a superposition of harmonic functions:

$$U = \sum_{l,m}\left(a_{lm} r^l + b_{lm} r^{-l-1}\right)P_l^m(\cos\theta)\,e^{\pm im\varphi}$$

From the symmetry of the physical distribution about the Z-axis it is clear that $U$ cannot depend on $\varphi$; hence $m = 0$. Also, since $U$ must vanish at $r = \infty$, every $a_l = 0$. Hence

$$U = \sum_l b_l r^{-l-1} P_l(\cos\theta) \tag{7-23}$$


The coefficients $b_l$ are to be determined by the condition that $V$ shall be zero on the surface of the sphere:

$$\left[\frac{q}{s}\right]_{\text{on sphere}} + \sum_l b_l r_0^{-l-1} P_l(\cos\theta) = 0$$

The first term on the left can be expanded by means of a theorem proved in the discussion of the Legendre polynomials (eq. 3-24 et seq.)

$$\frac{1}{s} = \frac{1}{a}\sum_l\left(\frac{r}{a}\right)^l P_l(\cos\theta) \quad \text{if } a > r \tag{7-24a}$$

Hence the foregoing condition becomes:

$$\sum_l\left[\frac{q}{a}\left(\frac{r_0}{a}\right)^l + b_l r_0^{-l-1}\right]P_l(\cos\theta) = 0$$

But this is satisfied only if the coefficient of every $P_l$ is zero, so that

$$b_l = -q\,a^{-l-1} r_0^{2l+1}$$

On substituting this back into (23) we find

$$U = -\frac{q r_0}{a}\,\frac{1}{r}\sum_l\left(\frac{r_0^2}{ar}\right)^l P_l(\cos\theta) \tag{7-25}$$


a result which permits a very simple and interesting interpretation. Consider a point, such as $a'$ (cf. Fig. 1) on the Z-axis. If $r > a'$, the expansion of $1/s'$ may be seen to be (see derivation leading to eq. 3-27)

$$\frac{1}{s'} = \frac{1}{r}\sum_l\left(\frac{a'}{r}\right)^l P_l(\cos\theta); \quad r > a' \tag{7-24b}$$

in contrast to (24a). But (25) is of the same form as this; indeed it becomes

$$U = -\frac{q r_0}{a}\,\frac{1}{s'}$$

if we put $a' = r_0^2/a$. Our final result may now be written

$$V = \frac{q}{s} - \frac{q'}{s'} \tag{7-26}$$

provided $q'$ is identified with $(r_0/a)q$. In words: when a conducting (earthed) sphere is placed near a point charge $+q$ it changes the potential in the same manner as would a point charge of opposite sign and magnitude $q' = (r_0/a)q$, placed at the point $a' = r_0^2/a$. The charge $q'$ is said to be the image of $q$.

The same reasoning holds when an earthed plane is placed near a charge. For, suppose we put $a = r_0 + \Delta$, $a' = r_0 - \Delta'$ and let $r_0$ go to infinity. From $a'a = r_0^2$ we then get $\Delta = \Delta'$, and $r_0/a$ approaches 1. It is seen that the effect of the plane can also be expressed by means of an image charge which, in this case, has the same magnitude as the real charge and is located at its mirror image.
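
The image construction lends itself to a direct numerical check. The sketch below, offered as an illustration only, sums the Legendre series for the induced potential with coefficients b_l = -q a^(-l-1) r0^(2l+1), compares it with the potential of the image charge, and verifies that the total potential vanishes on the sphere. The numerical values of q, a and r0, the truncation at 80 terms, and all function names are assumptions.

```python
import math

def legendre(l, x):
    """P_l(x) by the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def distance(r, theta, z0):
    """Distance from the field point (r, theta) to the point z = z0 on the axis."""
    return math.sqrt(r * r + z0 * z0 - 2 * r * z0 * math.cos(theta))

q, a, r0 = 1.0, 2.0, 1.0                  # illustrative values with a > r0
q_img, a_img = q * r0 / a, r0**2 / a      # image charge magnitude and position

def U_series(r, theta, terms=80):
    """Induced potential from the truncated sum (7-25)."""
    s = sum((r0**2 / (a * r)) ** l * legendre(l, math.cos(theta)) for l in range(terms))
    return -(q * r0 / a) * s / r

def V_image(r, theta):
    """Total potential (7-26): real charge plus image charge of opposite sign."""
    return q / distance(r, theta, a) - q_img / distance(r, theta, a_img)

r, theta = 1.6, 0.8
series_vs_image = abs(U_series(r, theta) + q_img / distance(r, theta, a_img))
on_sphere = max(abs(V_image(r0, t)) for t in (0.0, 0.5, 1.5, 3.0))
print(series_vs_image, on_sphere)
```

The series converges geometrically with ratio r0²/(ar), so a modest number of terms suffices for any field point outside the sphere.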


Problem. Find the potential of an electric dipole, and of an axial electric quadrupole. A dipole (quadrupole) may be defined as a distribution of charge whose potential, while vanishing at infinity, is proportional to $P_1(\cos\theta)$ (respectively $P_2(\cos\theta)$).

Ans. $c_1\dfrac{\cos\theta}{r^2}$, $\quad\dfrac{c_2}{r^3}(3\cos^2\theta - 1)$.


7.8. The Wave Equation.—To give a concise definition of a wave in 
physical descriptive terms is not an easy matter; mathematically it is 
defined as the condition of a physical quantity, U, which satisfies the 
differential equation 


$$v^2\nabla^2 U - \frac{\partial^2 U}{\partial t^2} = 0 \tag{7-27}$$


For a reason which will soon be evident, v is called the phase velocity of the 
wave. In general, v may be a function of space coordinates (wave travel- 
ing in a non-homogeneous medium). When this is true, eq. (27) has an 
enormous variety of solutions, some of which would hardly conform to the 
more intuitive conception commonly attached to the word wave. This 
general case is of special interest in quantum or wave mechanics and in 
certain branches of optics and will be dealt with in Chapter 11. 

In the present section, v will be considered constant, that is, independent 
of space and time. Before examining eq. (27) by the method of separation 
of variables, we discuss a form of solution which is interesting from a 
physical point of view. For it happens that this equation can be solved 
by the introduction of a single independent variable

$$\xi = \alpha x + \beta y + \gamma z + vt$$

$\alpha$, $\beta$, $\gamma$ being constants, provided $\nabla^2$ is written in its Cartesian form. On substituting this, (27) takes the form

$$\left[v^2(\alpha^2 + \beta^2 + \gamma^2) - v^2\right]\frac{d^2 U}{d\xi^2} = 0$$

which is clearly satisfied if we put

$$\alpha^2 + \beta^2 + \gamma^2 = 1 \tag{7-28}$$

Subject to this condition, the substitution

$$\eta = \alpha x + \beta y + \gamma z - vt$$

will also lead to a solution $U(\eta)$. The functional form of $U$ is left entirely arbitrary aside from the requirement that it must permit of two differentiations. We conclude, therefore, that

$$U = f_1(\xi) + f_2(\eta) \tag{7-29}$$

is a general solution of the wave equation (with constant $v$).

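Relation (28), however, allows the interpretation of $\alpha$, $\beta$, and $\gamma$ as direction cosines,³ that is, as components of a unit vector, $\boldsymbol{\sigma}$. Eq. (29) then takes the form

$$U = f_1(\boldsymbol{\sigma}\cdot\mathbf{r} + vt) + f_2(\boldsymbol{\sigma}\cdot\mathbf{r} - vt) \tag{7-30}$$

Now constant values of $f_1(\boldsymbol{\sigma}\cdot\mathbf{r} + vt)$ are defined by $\boldsymbol{\sigma}\cdot\mathbf{r} = -vt$; they lie on a plane traveling along $-\boldsymbol{\sigma}$ with a velocity $v$. Constant values of $f_2(\boldsymbol{\sigma}\cdot\mathbf{r} - vt)$ are given by $\boldsymbol{\sigma}\cdot\mathbf{r} = vt$; they lie on a plane traveling along $+\boldsymbol{\sigma}$ with velocity $v$. The representation (30) therefore describes two plane waves traveling in opposite directions with the same speed.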
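
The plane-wave solution can be tested numerically for any pair of twice differentiable profiles. The following sketch assumes a sine for f1 and a Gaussian for f2, a phase velocity v = 3, and the direction cosines (2/7, 3/7, 6/7), whose squares sum to one; none of these choices comes from the text. The wave equation is taken in the form v² ∇²U − ∂²U/∂t² = 0.

```python
import math

v = 3.0
alpha, beta, gamma = 2 / 7, 3 / 7, 6 / 7   # direction cosines: squares sum to 1

def U(x, y, z, t):
    """U = f1(xi) + f2(eta) with illustrative f1 = sin and f2 = Gaussian."""
    s = alpha * x + beta * y + gamma * z
    return math.sin(s + v * t) + math.exp(-(s - v * t) ** 2)

def second_difference(f, point, axis, h=1e-3):
    """Central second difference of f along one of its four arguments."""
    up, down = list(point), list(point)
    up[axis] += h
    down[axis] -= h
    return (f(*up) - 2 * f(*point) + f(*down)) / h**2

point = (0.3, -0.2, 0.5, 0.1)
laplacian = sum(second_difference(U, point, axis) for axis in range(3))
time_part = second_difference(U, point, 3)
residual = abs(v**2 * laplacian - time_part)
print(residual)
```

If the three constants are changed so that their squares no longer sum to one, the residual ceases to be small, which illustrates the role of condition (28).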

A solution of equal simplicity may be obtained when (27) is written in 
polar coordinates provided we assume that U is a function of the radius 
vector and $t$ alone. (The solution here derived is therefore far from general.) In that case, $\nabla^2$ reduces to $\partial^2/\partial r^2 + (2/r)(\partial/\partial r)$, and the equation reads

$$\frac{v^2}{r}\frac{\partial^2 (rU)}{\partial r^2} - \frac{\partial^2 U}{\partial t^2} = 0$$

The substitution $\xi = r + vt$, $rU = P$ converts it into $v^2(d^2P/d\xi^2) - v^2(d^2P/d\xi^2) = 0$; hence $P = f_1(r + vt)$. A similar result would have been achieved by choosing $\eta = r - vt$ in place of $\xi$. Hence

$$P = f_1(r + vt) + f_2(r - vt)$$

or

$$U = \frac{1}{r}\left[f_1(r + vt) + f_2(r - vt)\right] \tag{7-31}$$


3 This interpretation destroys generality; $\alpha$, $\beta$, $\gamma$ need not be real if only they satisfy (28).

This solution represents two spherical waves, one traveling in toward the origin, the other out from the origin. The factor $1/r$, without which $U$ would not be a solution of (27) and therefore not a wave, accounts for the attenuation of a spherical wave as it moves out from its source.
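
The role of the factor 1/r can be made concrete with a small numerical experiment. This is a sketch under assumed choices (f1 = cosine, f2 = Gaussian, v = 2, and the sample point are not from the text): the radial wave equation is evaluated by finite differences once for the sum divided by r and once for the same sum without the factor 1/r.

```python
import math

v = 2.0  # phase velocity (illustrative value)

def f_sum(r, t):
    """f1(r + vt) + f2(r - vt) with illustrative f1 = cos and f2 = Gaussian."""
    return math.cos(r + v * t) + math.exp(-(r - v * t) ** 2)

def with_factor(r, t):
    """The spherical wave: the same sum divided by r."""
    return f_sum(r, t) / r

def wave_residual(U, r, t, h=1e-3):
    """(v^2 / r) d2(rU)/dr2 - d2U/dt2, estimated by central differences."""
    d2_r = ((r + h) * U(r + h, t) - 2 * r * U(r, t) + (r - h) * U(r - h, t)) / h**2
    d2_t = (U(r, t + h) - 2 * U(r, t) + U(r, t - h)) / h**2
    return v**2 * d2_r / r - d2_t

res_with = abs(wave_residual(with_factor, 1.5, 0.2))     # expected to be tiny
res_without = abs(wave_residual(f_sum, 1.5, 0.2))        # expected to be of order 1 or more
print(res_with, res_without)
```

Without the factor the residual equals 2v²/r times the radial derivative of the profile, which does not vanish in general.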

By suitable choices of $f_1$ and $f_2$ a great variety of wave complexes can be formed, of which standing waves, defined by the condition $U(r,t) = F(r)\cdot G(t)$ where $F$ and $G$ represent new functions, are perhaps the simplest.


Problem. Show that, if $f_1$ and $f_2$ are both sine functions, written in the customary form $\sin\,(2\pi/\lambda)(r \pm vt)$, $U$ represents a standing wave.


We now turn to a more detailed analysis of the wave equation, based on 
the method of separation of variables. On assuming that 


U = ST 


where $S$ is a function of space coordinates and $T$ a function of $t$ only, (27) is changed to the form

$$v^2\frac{\nabla^2 S}{S} = \frac{\ddot{T}}{T}$$

the dots denoting time derivatives. Each side of this equation must equal the same constant which, for convenience, we shall call $-\omega^2$. No supposition concerning the reality of $\omega$ is here implied, although $\omega$ will turn out to be real in the more interesting practical applications. The equation

$$\ddot{T} + \omega^2 T = 0$$

has the general solution

$$T_\omega = c_1 e^{i\omega t} + c_2 e^{-i\omega t} \tag{7-32}$$

The constant $\omega$, clearly, has the meaning of an “angular” frequency. Now the space part of the wave function is defined by the equation

$$\nabla^2 S + \frac{\omega^2}{v^2} S = 0$$

The constant $\omega/v$ will henceforth be denoted by $k$; in terms of the wave length $\lambda$, which is related to $\omega$ and $v$ by the well known formula

$$\lambda = 2\pi\frac{v}{\omega}$$

$k = 2\pi/\lambda$. It signifies the number of waves of given $\omega$ per $2\pi$ units of length and is called the wave number. The equation

$$\nabla^2 S + k^2 S = 0 \tag{7-33}$$
is the basis of the entire theory of vibrations and will be referred to as the space form of the wave equation. The remainder of the present section will be devoted to its study.

7.9. One Dimension.—Eq. (33) reduces to the simple form

$$\frac{d^2 S}{dx^2} + k^2 S = 0$$

which has the solution

$$S_k = a e^{ikx} + b e^{-ikx}$$

One such solution is obtained for every value of $k$. For $k = 0$, $S_0 = ax + b$. It should be noted that

$$S = \sum_k S_k$$

is not a solution of (33), but that

$$U = \sum_k S_k T_k$$

is a solution of (27). (We are writing $T_k$ in place of $T_\omega$ because $k$ is fixed when $\omega$ is chosen.) Similar caution is required in all subsequent considerations.
7.10. Two Dimensions.—a. Rectangular Coordinates. The work goes 
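as in sec. 7.3. In place of eq. 4 we now have

$$\frac{X''}{X} + \frac{Y''}{Y} + k^2 = 0$$

Separation is achieved by putting

$$\frac{X''}{X} = -k_1^2, \qquad \frac{Y''}{Y} = -k_2^2$$

and requiring that

$$k_1^2 + k_2^2 = k^2$$

Hence

$$S_{k_1 k_2} = XY = c_{k_1 k_2}\, e^{i(k_1 x + k_2 y)} \tag{7-34}$$
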
b. Polar Coordinates. In place of (9) there results

$$\rho^2\frac{P''}{P} + \rho\frac{P'}{P} + \frac{\Phi''}{\Phi} + k^2\rho^2 = 0$$

On equating $\Phi''/\Phi$ to $-m^2$ the radial equation becomes

$$\rho^2 P'' + \rho P' + (k^2\rho^2 - m^2)P = 0$$

It is identical with Bessel’s (eq. 2-57) when the independent variable is taken to be $k\rho$. Hence

$$S_{km} = Z_m(k\rho)\,e^{\pm im\varphi}$$

or, more generally,

$$S_k = \sum_m a_m Z_m(k\rho)\,e^{im\varphi} \tag{7-35}$$


7.11. Three Dimensions.—a. Rectangular Coordinates. Immediate 
generalization of eq. (34) shows that 


$$S_{k_1 k_2 k_3} = c_{k_1 k_2 k_3}\, e^{i(k_1 x + k_2 y + k_3 z)} \tag{7-36}$$

provided that $k_1^2 + k_2^2 + k_3^2 = k^2$. If $k_1$, $k_2$, $k_3$ are taken to be real (an assumption destroying the generality of the solution) they may be regarded as the components of a vector $\mathbf{k}$, and (36) may be written

$$S(\mathbf{k}) = c(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{r}} \tag{7-37}$$

When this result is combined with (32) one sees that a solution of the wave equation (27) has the form

$$U = \sum_{\mathbf{k}} c(\mathbf{k})\,e^{i(\mathbf{k}\cdot\mathbf{r} - kvt)}$$

or

$$U = \int c(\mathbf{k})\,e^{i(\mathbf{k}\cdot\mathbf{r} - kvt)}\,d\mathbf{k} \tag{7-38}$$

The notation⁴ used here, which is rather common in modern physics, is to be understood as follows: A function of a vector, such as $c(\mathbf{k})$, is simply to be regarded as a function of the three real variables $k_1$, $k_2$, and $k_3$; $d\mathbf{k}$ is an abbreviation for the product of three differentials: $dk_1\,dk_2\,dk_3$. Summations and integrations over $\mathbf{k}$ are therefore threefold.

Eq. (38) is a very useful form of the solution of the wave equation. Physically, it corresponds to the construction of a general wave by superposition of plane sinusoidal waves. It also permits initial conditions to be included in the calculation rather easily. For suppose that we know the form of the disturbance at $t = 0$, $U_0(x,y,z)$. The $c(\mathbf{k})$ are then given at once by the Fourier analysis of this function, viz.:

$$U_0(x,y,z) = \int c(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k}$$

and (38) represents the wave at any other time.


Problem a. Show that, in general,

$$U(x,y,z,t) = (2\pi)^{-3}\int\!\!\int\!\!\int\!\!\int\!\!\int\!\!\int U_0(x',y',z')\,e^{i[\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}') - kvt]}\,dx'\,dy'\,dz'\,dk_1\,dk_2\,dk_3$$

If $U_0$ is concentrated at the origin, that is, if $U_0$ is the limit of a function which tends to $\infty$ at the origin, but in such a way that $\int U_0\,dx\,dy\,dz = 1$, then

$$U(x,y,z,t) = (2\pi)^{-3}\int\!\!\int\!\!\int e^{i(\mathbf{k}\cdot\mathbf{r} - kvt)}\,dk_1\,dk_2\,dk_3$$

Problem b. Show that

(α) $U$ decreases continually with time at $r = 0$.
(β) $U$ is zero wherever $|\mathbf{r}| > vt$.

Note the physical significance of these results.

4 This notation is indeed ambiguous. In vector analysis, $d\mathbf{k}$ is the element of a vector, and hence itself a vector. Here it means the element of volume in k-space, which is not a vector. But the convenience of the present notation is so great that we shall occasionally employ it when confusion is not likely to arise.


b. Cylindrical Coordinates. The substitutions in sec. 7.4b lead to the 

ordinary differential equations 

$$Z'' = -\kappa^2 Z$$

$$\Phi'' = -l^2\Phi$$

$$\rho^2 P'' + \rho P' - \left[(\kappa^2 - k^2)\rho^2 + l^2\right]P = 0$$

The last equation has the solution

$$P = Z_l\!\left(\sqrt{k^2 - \kappa^2}\,\rho\right)$$

Consequently

$$S_{k\kappa l} = c\,e^{\pm i(\kappa z \pm l\varphi)}\,Z_l\!\left(\sqrt{k^2 - \kappa^2}\,\rho\right) \tag{7-39}$$

If this function is to be single-valued in $\varphi$, $l$ must be an integer. Constructing a solution of the wave equation wherein the space function has the form (39) we thus obtain

$$U = \sum_{k,\kappa,l} S_{k\kappa l}\,T_k$$

But it is usually more satisfactory to indicate the nature of the summations ($l$ is integral, $k$ and $\kappa$ may vary continuously) more explicitly. If, furthermore, we limit $\sqrt{k^2 - \kappa^2}$ to real, positive values (thus again destroying generality) and call this quantity $\mu$, the following useful representation is obtained:

$$U = \int dk \sum_{l=-\infty}^{\infty}\int_0^{\infty} g_l(\mu)\,e^{i(\kappa z + l\varphi - kvt)}\,J_l(\mu\rho)\,\mu\,d\mu \tag{7-40}$$

where we have written $c = g\mu$ for convenience later, and $J_l$ is the Bessel function defined in sec. 2.14.

The type of problem in which eq. (40) is used is this. Suppose that at $t = 0$, the disturbance is confined to the plane $z = 0$ where it has the form $U_0(\rho,\varphi)$. Also, let the wave be monochromatic ($k$ = const., so that integration over $dk$ is absent). Then

$$U_0(\rho,\varphi) = \sum_{l=-\infty}^{\infty} e^{il\varphi}\int_0^{\infty} g_l(\mu)\,J_l(\mu\rho)\,\mu\,d\mu \tag{7-41}$$

and from this relation all coefficients $g_l(\mu)$ can be determined. For if we multiply both sides of (41) by $e^{-il'\varphi}$ and integrate over $\varphi$ from 0 to $2\pi$, we obtain

$$\int_0^{\infty} g_{l'}(\mu)\,J_{l'}(\mu\rho)\,\mu\,d\mu = \frac{1}{2\pi}\int_0^{2\pi} U_0(\rho,\varphi)\,e^{-il'\varphi}\,d\varphi \equiv U_{l'}(\rho)$$

This, however, is nothing other than a Fourier-Bessel transformation⁵ of $U_{l'}(\rho)$, and it follows that

$$g_l(\mu) = \int_0^{\infty} U_l(\rho)\,J_l(\mu\rho)\,\rho\,d\rho$$


Problem. Show that the diffraction pattern due to a plane monochromatic wave passing through a circular aperture of radius $a$ is given by

$$U(\rho,z) = \text{const.}\int_0^{\infty} J_0(\mu\rho)\,J_1(\mu a)\,e^{i\kappa z}\,d\mu$$


c. Spherical (Polar) Coordinates. The equation for $S$ is similar to (15), except that the term $+k^2 S$ is also present on the left. The substitution

$$S = R(r)\cdot\Theta(\theta)\cdot\Phi(\varphi)$$

now leads to the three equations

$$\Phi'' = -m^2\Phi \tag{7-42a}$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\Theta'\right) - \frac{m^2}{\sin^2\theta}\Theta + l(l+1)\Theta = 0 \tag{7-42b}$$

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2 R'\right) + \left[k^2 - \frac{l(l+1)}{r^2}\right]R = 0 \tag{7-42c}$$

The second of these is the equation for associated Legendre functions ($l$ and $m$ are integers again: $m$ to insure single-valuedness in $\varphi$, $l$ in order that the solution $\Theta$ be a polynomial, i.e., that it should not diverge for $\cos\theta = \pm 1$). The third equation may be transformed as follows. Put $R = P/r$, and change the independent variable to $\xi = kr$. Eq. (42c) then takes the form

$$\frac{d^2 P}{d\xi^2} + \left[1 - \frac{l(l+1)}{\xi^2}\right]P = 0$$

Again, put $P = \sqrt{\xi}\,Q$, so that the last equation reads

$$\frac{d^2 Q}{d\xi^2} + \frac{1}{\xi}\frac{dQ}{d\xi} + \left[1 - \frac{(l + \tfrac{1}{2})^2}{\xi^2}\right]Q = 0$$

This is at once recognized as Bessel’s equation (2-57); hence

$$Q = Z_{l+1/2}(\xi)$$

so that

$$R = c\,r^{-1/2} Z_{l+1/2}(kr)$$

For the space part of the wave function, we thus find

$$S_k = \sum_{l,m} c_{k,l,m}\,P_l^m(\cos\theta)\,e^{im\varphi}\,r^{-1/2} Z_{l+1/2}(kr) \tag{7-43}$$

5 For further details, see sec. 8.3.
m, 


A sum of the form

$$\sum_{m=-l}^{l} C_m P_l^m(\cos\theta)\,e^{im\varphi}$$

with arbitrary coefficients $C_m$ is often called a spherical harmonic and denoted by the symbol $Y_l(\theta,\varphi)$. In using this symbol one must remember that the function which it represents is not unique, but contains $2l + 1$ arbitrary constants. With this abbreviation, then,

$$S_k = \sum_l c_{k,l}\,Y_l(\theta,\varphi)\,r^{-1/2} Z_{l+1/2}(kr)$$

and the wave function is

$$U = \sum_k S_k T_k = \int dk \sum_{l=0}^{\infty} c_{k,l}\,Y_l(\theta,\varphi)\,r^{-1/2} Z_{l+1/2}(kr)\,e^{-ikvt} \tag{7-44}$$
k i=0 


7.12. Examples of Solutions of the Wave Equation.—The local pressure 
P in a gas traversed by a sound wave, satisfies the wave equation. 


a. The simplest type of a wave is that emitted by a “breathing” sphere, i.e., a sphere performing volume oscillations without distortion. It is characterized by the two boundary conditions:

$$(\alpha)\quad P_{r=r_0} = \text{const.}\;e^{-i\omega t}$$

$$(\beta)\quad P_{r\to\infty} = f(r)\,e^{i(kr-\omega t)}$$

Condition (α) states that at the surface of the sphere (of radius $r_0$) all points shall be in phase; condition (β) implies that at infinity the wave shall be an outgoing one. We limit ourselves to monochromatic waves (pure tones), so that there is only one value of $k$ or $\omega$. Clearly, spherical polar coordinates must here be used. Considering then eq. (44), we must first omit the integration over $k$. Since in accordance with condition (α) there must, at $r = r_0$, be no functional dependence either on $\varphi$ or on $\theta$, both $l$ and $m$ are zero. Hence (44) reduces to

$$P = C\,r^{-1/2} Z_{1/2}(kr)\,e^{-i\omega t}$$

But the general Bessel function

$$Z_{1/2}(x) = a_1 J_{1/2}(x) + a_2 J_{-1/2}(x)$$

where $J_{1/2}(x) = \sqrt{2/\pi x}\,\sin x$ and $J_{-1/2}(x) = \sqrt{2/\pi x}\,\cos x$, as was shown in Chapter 3. Inserting these, we have

$$P = C\left(a_1\frac{\sin kr}{kr} + a_2\frac{\cos kr}{kr}\right)e^{-i\omega t}$$

In order to satisfy condition (β) we put $a_1 = 1$, $a_2 = -i$, obtaining (constant factors being absorbed in $C$)

$$P = \frac{C}{r}\,e^{i(kr-\omega t)}$$

as our final result.


b. When the sphere of the preceding example vibrates, not with spherical symmetry, but in such a way that condition (α) reads

$$(\alpha)\quad P_{r=r_0} = \text{const.}\;\cos\theta\;e^{-i\omega t}$$

it is said to emit dipole waves. Condition (β) remains unchanged. Of all the functions composing $Y_l(\theta)$, only $P_1^0(\cos\theta)$ is a cosine function. Therefore $l$ must be 1. Hence (44) now reduces to

$$P = C\,r^{-1/2} Z_{3/2}(kr)\cos\theta\;e^{-i\omega t}$$

But

$$r^{-1/2} Z_{3/2}(kr) = r^{-1/2}\left[a_1 J_{3/2}(kr) + a_2 J_{-3/2}(kr)\right]$$

and this is proportional to

$$a_1\left[\frac{\sin kr}{(kr)^2} - \frac{\cos kr}{kr}\right] + a_2\left[-\frac{\sin kr}{kr} - \frac{\cos kr}{(kr)^2}\right]$$

If this expression is to satisfy condition (β), it is necessary to choose

$$a_1 = -1, \qquad a_2 = -i$$

so that

$$r^{-1/2} Z_{3/2}(kr) \propto \left[\frac{1}{kr} + \frac{i}{(kr)^2}\right]e^{ikr}$$

and

$$P = C\left[\frac{1}{kr} + \frac{i}{(kr)^2}\right]\cos\theta\;e^{i(kr-\omega t)}$$

The constant $C$ may be complex. If it is written $C = C_1 + iC_2$, the real part of $P$, which alone is of interest, will be

$$\Re P = \left(\frac{C_1}{kr} - \frac{C_2}{(kr)^2}\right)\cos\theta\cos(kr - \omega t) - \left(\frac{C_2}{kr} + \frac{C_1}{(kr)^2}\right)\cos\theta\sin(kr - \omega t)$$

For small values of $r$,

$$\Re P = -\frac{\cos\theta}{(kr)^2}\left[C_2\cos(kr - \omega t) + C_1\sin(kr - \omega t)\right],$$

for large $r$,

$$\Re P = \frac{\cos\theta}{kr}\left[C_1\cos(kr - \omega t) - C_2\sin(kr - \omega t)\right]$$

If $C_1$ is zero, the disturbance is of the form $\cos(kr - \omega t)$ near the surface of the sphere, but of the sine form at infinity. If $C_2 = 0$, the reverse is true. There occurs, therefore, a curious change of phase as the wave moves outward.
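
The change of phase between the near and far regions can be seen numerically. In the sketch below the complex dipole pressure C[1/(kr) + i/(kr)²] cos θ e^{i(kr−ωt)} is evaluated and its real part compared with the small-r and large-r forms. The choice C = i (so that C1 = 0, C2 = 1), the values k = ω = 1, the angle, the time and the two radii are assumptions made purely for illustration.

```python
import cmath
import math

def dipole_pressure(r, theta, t, C, k=1.0, omega=1.0):
    """Complex dipole wave; only its real part is of physical interest."""
    x = k * r
    return C * (1 / x + 1j / x**2) * math.cos(theta) * cmath.exp(1j * (x - omega * t))

def near_form(r, theta, t, C, k=1.0, omega=1.0):
    """Leading behaviour of the real part for small kr."""
    x, phase = k * r, k * r - omega * t
    return -math.cos(theta) / x**2 * (C.imag * math.cos(phase) + C.real * math.sin(phase))

def far_form(r, theta, t, C, k=1.0, omega=1.0):
    """Leading behaviour of the real part for large kr."""
    x, phase = k * r, k * r - omega * t
    return math.cos(theta) / x * (C.real * math.cos(phase) - C.imag * math.sin(phase))

C = 1j              # C1 = 0, C2 = 1: cosine form near the sphere, sine form far away
theta, t = 0.4, 0.3
near_ratio = dipole_pressure(1e-3, theta, t, C).real / near_form(1e-3, theta, t, C)
far_ratio = dipole_pressure(1e3, theta, t, C).real / far_form(1e3, theta, t, C)
print(near_ratio, far_ratio)
```

Both ratios should be close to one, the deviations being of relative order kr in the near region and 1/(kr) in the far region.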

7.13. Equation of Heat Conduction and Diffusion.—The temperature 
U in a homogeneous medium, in which A (x,y,z) calories of heat are gener- 
ated (by some unspecified agency) per unit of volume surrounding the 
point (x,y,z) per second, and which has density $\rho$, specific heat $s$, and thermal conductivity $\kappa$, satisfies the partial differential equation

$$\frac{\partial U}{\partial t} = \frac{\kappa}{\rho s}\nabla^2 U + \frac{A}{\rho s} \tag{7-45}$$


Various simplifying conditions may arise: In the first place, attention may 
be confined to ‘‘steady states,” that is, to temperature distributions which 
do not change with time. Such states will always occur in physical and 
chemical problems after heat conduction has taken place for a sufficiently 
long time. In that case, $\partial U/\partial t$ is zero, and the equation reads

$$\nabla^2 U = -\frac{A}{\kappa} \tag{7-46}$$


It is of the form of Poisson’s equation which will be discussed in sec. 7.17. If, in addition, it is assumed that no heat is generated anywhere within the body, $A = 0$ and (46) becomes identical with Laplace’s equation which we have already studied.

Of greater interest is the situation in which, to be sure, $A$ is taken to be zero, but consideration is given to non-steady states. The temperature is then subject to the equation

$$\frac{\kappa}{\rho s}\nabla^2 U - \frac{\partial U}{\partial t} = 0 \tag{7-47}$$

which is very similar to the wave equation. (This equation is derived by vector methods in sec. 4.18.)

In the kinetic theory, one meets the equation of diffusion which regulates 
the flow of fluid matter within another material medium. It states in its 


basic form: 


$$\frac{\partial U}{\partial t} = \nabla\cdot(D\,\nabla U) \tag{7-48}$$


$U$ represents the concentration of fluid matter, $D$ its coefficient of diffusion. Strictly speaking, $D$ is a function of $U$ and hence of (x,y,z). But for small concentrations $D$ is found to be very nearly constant. For that case, then, (48) may be written

$$D\nabla^2 U - \frac{\partial U}{\partial t} = 0 \tag{7-49}$$

All parameters appearing in (49) as well as in (47) are positive, hence both of these equations will be written in the form

$$a^2\nabla^2 U - \frac{\partial U}{\partial t} = 0 \tag{7-50}$$

and we remember that, for heat conduction, $U$ = temperature and $a^2 = \kappa/\rho s$, while for diffusion, $U$ = concentration and $a^2 = D$. The remainder of this section is devoted to the solutions of eq. (50).

Separation may at once be achieved by putting $U = S(x,y,z)\cdot T(t)$, and it is found that $a^2\nabla^2 S/S = \dot{T}/T$. On equating the right-hand side to $-a^2k^2$, $k$ being an arbitrary constant, it is seen that

$$T_k = \text{const.}\;e^{-a^2k^2t} \tag{7-51}$$

while $S$ must satisfy

$$\nabla^2 S + k^2 S = 0 \tag{7-52}$$

an equation identical with the space form of the wave equation, (33). If, therefore, we combine the solutions of (33), discussed in the preceding section, with $T_k$ in the form (51), we have an answer to the problems of heat conduction and diffusion.
7.14. Example: Linear Flow of Heat.—Suppose that heat flows in a 
linear filament placed along the X-axis. The solution of (52) is then 
$$S_k = c_k e^{ikx} + d_k e^{-ikx}$$
and this may be taken as 

$$S_k = c_k e^{ikx}$$

if we assign both positive and negative values to k. The general solution 
reads: 

$$U = \sum_k S_k T_k = \int_{-\infty}^{\infty} c(k)\,e^{ikx}\,e^{-a^2k^2t}\,dk \tag{7-53}$$

Every choice for c(k) will satisfy eq. (47), but the proper selection is to 
be made in accordance with initial conditions. Let us suppose, then, that 
$U = U_0(x)$ at t = 0. Eq. (53) now states: 

$$U_0(x) = \int_{-\infty}^{\infty} c(k)\,e^{ikx}\,dk$$

and c(k) may be obtained from this by means of a Fourier transformation. 
In view of eq. (8-13), 

$$c(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} U_0(x')\,e^{-ikx'}\,dx'$$

so that (53) becomes 

$$U(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U_0(x')\,e^{ik(x-x') - a^2k^2t}\,dx'\,dk$$

The integration with respect to k can be performed: 

$$\int_{-\infty}^{\infty} e^{ik(x-x') - a^2k^2t}\,dk = \left(\frac{\pi}{a^2t}\right)^{1/2} e^{-(x-x')^2/4a^2t}$$

whence 

$$U(x,t) = \frac{1}{2a\sqrt{\pi t}}\int_{-\infty}^{\infty} U_0(x')\,e^{-(x-x')^2/4a^2t}\,dx' \tag{7-54}$$
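
As a numerical illustration of eq. (7-54), the following sketch evaluates the integral by the midpoint rule and compares it, for a step-function initial distribution, with the closed form in terms of the error function. The function names, the truncation of the integration range at a fixed multiple of the kernel width, and the number of sample points are illustrative assumptions, not part of the text.

```python
import math

def heat_solution(u0, x, t, a=1.0, half_width=12.0, n=4000):
    """Evaluate eq. (7-54) by the midpoint rule over x' in [x - w, x + w]."""
    s = 2.0 * a * math.sqrt(t)      # width scale of the Gaussian kernel
    w = half_width * s              # the kernel is negligible beyond this range
    h = 2.0 * w / n
    total = 0.0
    for j in range(n):
        xp = x - w + (j + 0.5) * h
        total += u0(xp) * math.exp(-((x - xp) / s) ** 2)
    # prefactor 1 / (2 a sqrt(pi t)) written as 1 / (s sqrt(pi))
    return total * h / (s * math.sqrt(math.pi))

def step(xp):
    """Initial distribution: 1 for |x'| < 1, 0 otherwise."""
    return 1.0 if abs(xp) < 1.0 else 0.0

def step_closed_form(x, t, a=1.0):
    """Closed form for the step distribution, expressed with the error function."""
    s = 2.0 * a * math.sqrt(t)
    return 0.5 * (math.erf((1.0 - x) / s) + math.erf((1.0 + x) / s))

if __name__ == "__main__":
    print(heat_solution(step, 0.3, 0.1), step_closed_form(0.3, 0.1))
```

The two printed numbers should agree to within the discretization error of the quadrature, which is dominated by the jump of the step function at |x'| = 1.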
Problems. 
a. Prove that (54) reduces to $U_0(x)$ for t = 0. 
b. Show that, if $U_0(x)$ is a step function such that 
$$U_0 = \begin{cases} 1 & \text{if } |x| < 1 \\ 0 & \text{if } |x| > 1 \end{cases}$$
then 
$$U = \frac{1}{2}\left[\operatorname{erf}\!\left(\frac{1-x}{2a\sqrt{t}}\right) + \operatorname{erf}\!\left(\frac{1+x}{2a\sqrt{t}}\right)\right]$$

c. Show that, if 
$$U_0 = \begin{cases} 1 & \text{for } x > 0 \\ 0 & \text{for } x < 0 \end{cases} \qquad\text{then}\qquad U = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{2a\sqrt{t}}\right)\right]$$
Interpret the last two problems from the point of view of diffusion. 
d. Suppose $U_0$ is a “function” which is everywhere zero except at x = 0, where it 
tends to ∞ in such a way that $\int U_0(x)\,dx = 1$. (Such a “function” was introduced 
by Dirac and is commonly known to physicists as a δ-function. Strictly speaking it is 
no function at all.) Then, clearly, 
$$U = \frac{1}{2a\sqrt{\pi t}}\,e^{-x^2/4a^2t}$$
Discuss the temperature at any point x, and show in particular that it will rise to a 
maximum at $t = x^2/2a^2$. This fact affords a simple experimental determination of a, 
and hence of D and the thermal quantities. 
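
The statement of problem (d) about the time of maximum temperature can be checked numerically. The sketch below scans a uniform grid of times and records where the point-source solution is largest; the grid, its upper limit and the function names are assumptions made only for this illustration.

```python
import math

def point_source(x, t, a=1.0):
    """Point-source solution U = exp(-x^2 / 4 a^2 t) / (2 a sqrt(pi t))."""
    return math.exp(-x * x / (4.0 * a * a * t)) / (2.0 * a * math.sqrt(math.pi * t))

def time_of_maximum(x, a=1.0, t_max=10.0, n=20000):
    """Grid search for the time at which U(x, t) is largest at fixed x."""
    best_t, best_u = None, -1.0
    for j in range(1, n + 1):
        t = t_max * j / n
        u = point_source(x, t, a)
        if u > best_u:
            best_t, best_u = t, u
    return best_t

if __name__ == "__main__":
    x, a = 1.5, 0.8
    print(time_of_maximum(x, a), x * x / (2.0 * a * a))
```

The grid value should agree with $x^2/2a^2$ to within one grid spacing, provided the maximum lies inside the scanned interval.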


7.15. Two-Dimensional Flow of Heat.—In polar coordinates, S is given 
by eq. (35). Hence 

$$U = \sum_m \int_{-\infty}^{\infty} c_m(k)\,Z_m(k\rho)\,e^{im\varphi}\,e^{-a^2k^2t}\,dk \tag{7-55}$$

If, as we shall suppose, the temperature distribution at t = 0 is radially 
symmetrical, so that U does not depend on φ, the only value permitted to m 
is zero. Also, since $Z_0$ is an even function, the integration in (55) may be 
taken from 0 to ∞ without error. For $Z_0$ we shall take the $J_0$-function, 
because it will at once be seen that most temperature distributions can be 
expressed in terms of $J_0$ alone. Thus 

$$U = \int_0^{\infty} c(k)\,J_0(k\rho)\,e^{-a^2k^2t}\,dk \tag{7-56}$$

Let us write 
$$c(k) = k\,g(k)$$

and suppose that $U = U_0(\rho)$ at t = 0. It is then easy to determine g(k) 
formally and hence $U(\rho,t)$. For in accordance with (56) 

$$U_0(\rho) = \int_0^{\infty} g(k)\,J_0(k\rho)\,k\,dk$$

in other words, g(k) is the Fourier-Bessel transform of $U_0(\rho)$. (Cf. Sec. 8.3.) 
Hence 

$$g(k) = \int_0^{\infty} U_0(\rho')\,J_0(k\rho')\,\rho'\,d\rho'$$
When this is put back into (56) the final form of $U(\rho,t)$ is obtained: 
$$U(\rho,t) = \int_0^{\infty}\int_0^{\infty} U_0(\rho')\,J_0(k\rho')\,J_0(k\rho)\,e^{-a^2k^2t}\,k\rho'\,dk\,d\rho' \tag{7-57}$$
Problem. Show that, if $U_0(\rho)$ is concentrated at ρ = 0, and $\int_0^{\infty} U_0(\rho)\,\rho\,d\rho = 1$, 

$$U(\rho,t) = \frac{1}{2a^2t}\,e^{-\rho^2/4a^2t}$$

Compare this with problem (d) of Sec. 7.14. Interpret the above as a diffusion problem. 


7.16. Heat Flow in Three Dimensions.—In rectangular coordinates, U 
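is given as a generalization of eq. (53) (cf. eq. (36) for the form of S): 

$$U = \iiint_{-\infty}^{\infty} c(k_1,k_2,k_3)\,e^{i(k_1x + k_2y + k_3z)}\,e^{-a^2(k_1^2 + k_2^2 + k_3^2)t}\,dk_1\,dk_2\,dk_3$$
or, with the use of the vector notation previously explained (cf. footnote 
on p. 227) 
$$U = \iiint_{-\infty}^{\infty} c(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,e^{-a^2k^2t}\,d\mathbf{k} \tag{7-58}$$

We now repeat essentially the procedure leading from (53) to (54), but 
using three variables instead of one. 

$$U_0(x,y,z) = \iiint_{-\infty}^{\infty} c(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k}$$

hence c(k) is the Fourier transform of $U_0$: 

$$c(\mathbf{k}) = \frac{1}{8\pi^3}\iiint_{-\infty}^{\infty} U_0(x',y',z')\,e^{-i\mathbf{k}\cdot\mathbf{r}'}\,d\tau'$$
whence 

$$\begin{aligned} U(x,y,z,t) &= \frac{1}{8\pi^3}\int\!\cdots\!\int U_0(x',y',z')\,e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}') - a^2k^2t}\,d\mathbf{k}\,d\tau' \\ &= \left(2a\sqrt{\pi t}\right)^{-3}\iiint_{-\infty}^{\infty} U_0(x',y',z')\,e^{-(\mathbf{r}-\mathbf{r}')^2/4a^2t}\,d\tau' \end{aligned} \tag{7-59}$$

If $U_0$ is a function of x′ alone, the integration over y′ and z′ may be performed, 
and the result is identical with (54), as it should be. Of greatest 
practical importance is the case where $U_0$ is a function of r′ alone. The 
volume element dτ′ may then be written in polar form: $r'^2\,dr'\,\sin\theta'\,d\theta'\,d\varphi'$, 
and the integration over θ′ and φ′ can be performed. It is to be observed 
in this connection that 
$$(\mathbf{r} - \mathbf{r}')^2 = r^2 + r'^2 - 2rr'\cos\theta'$$

One then finds 

$$U(r,t) = \left(2ar\sqrt{\pi t}\right)^{-1}\int_0^{\infty} U_0(r')\left[e^{-(r-r')^2/4a^2t} - e^{-(r+r')^2/4a^2t}\right] r'\,dr'$$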


Problem. Show that, if 


where 
l&r 2 z 
uH wo fire 
* avi’ Ved, ~” 


7.17. Poisson’s Equation.—All partial differential equations treated 
thus far in the present chapter are linear and homogeneous in the dependent 
variables (cf. footnote in sec. 2.5). It is only for this type of equation 
that the method of separation of variables may work. The variety of 
linear and inhomogeneous equations of importance in scientific analysis 
is also great, but there exists for their solution no method nearly so 
powerful as the separation of variables. 

An equation like (33), the space form of the wave equation, would 
become inhomogeneous if the right-hand side were not zero but some 
function f(x,y,z). One remarkable feature of an inhomogeneous equation, 
which will here only be mentioned, is that it may not possess solutions for 
every value of k even though the homogeneous equation, with the same 
boundary condition, has solutions. The inhomogeneity selects, as it were, 
special values of the parameter k for which solutions are possible. This 
phenomenon, which is the rule for inhomogeneous equations, may also 
occur for homogeneous ones if the boundary or initial conditions of the 
problem are sufficiently stringent. It will be discussed under the heading 
“characteristic values” or “eigenvalues” in Chapter 8. 

An inhomogeneous equation which is rather common is Poisson’s; 
it will here be chosen to illustrate a process of solution. Its general form is: 

$$\nabla^2\Phi = f(x,y,z) \tag{7-60}$$
One encounters it (1) in electrostatics, where $\Phi$ is the ordinary potential 
and f represents a constant times the distribution of charge,⁸ $\rho(x,y,z)$, the 
constant depending on the units chosen; (2) in the theory of heat flow, 
where eq. (45) takes the form (60) when $\partial U/\partial t = 0$, as shown in eq. (46). 

To solve (60) we first recall Green’s theorem (see sec. 4.19) which 
states that, for any two functions of space coordinates, u and v, which are 
finite, continuous and have continuous first and second derivatives, 

$$\int (u\nabla^2 v - v\nabla^2 u)\,d\tau = \int (u\nabla v - v\nabla u)\cdot d\boldsymbol{\sigma} \tag{7-61}$$

Here $\tau$ represents a certain closed volume and $\sigma$ its surface; $d\boldsymbol{\sigma}$ is taken 
positive in the direction outward from the volume. In our problem we are 
given the function f(x,y,z) and we wish to find $\Phi(x',y',z')$ for a fixed point of 
observation (x′,y′,z′). In the following it is necessary to distinguish 
between this fixed point, which will be denoted by primes, and the variable 
point (x,y,z) over which integrations are to be performed. 

It will prove convenient to consider, in connection with theorem (61), a 
volume $\tau$ such as that depicted in Fig. 2. It is bounded by the outer surface 
$\sigma_2$ and the inner surface $\sigma_1$, a spherical cavity of radius $s_0$ about the 
fixed point P′. The function u will be specified to be 

$$u = \frac{1}{|\mathbf{r} - \mathbf{r}'|} \equiv \frac{1}{s};$$

it satisfies Laplace’s equation $\nabla^2 u = 0$, as may readily be verified. Then 
eq. (61) reads: 

$$\int \frac{\nabla^2 v}{s}\,d\tau = \int \left[\frac{\nabla v}{s} - v\,\nabla\!\left(\frac{1}{s}\right)\right]\cdot d\boldsymbol{\sigma} \tag{7-62}$$

⁸ If ρ = 0 the equation reduces to Laplace’s as pointed out in sec. 7.2. 


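If now we interpret v as $\Phi$, we may replace $\nabla^2 v$ by f in accordance with (60). 
The right-hand side of (62) consists of two integrations, one over $d\sigma_1$ and the 
other over $d\sigma_2$. Consider first that over $d\sigma_1$. Clearly, $\nabla\Phi\cdot d\boldsymbol{\sigma}_1$ approaches 
$-(\partial\Phi/\partial r)|_{P'}\,d\sigma_1$ as $s_0$ tends to zero, the minus sign coming from the fact 
that $d\boldsymbol{\sigma}_1$ is inward with respect to the cavity. Hence 

$$\int_{\sigma_1} \frac{\nabla\Phi}{s}\cdot d\boldsymbol{\sigma} \to -\left.\frac{\partial\Phi}{\partial r}\right|_{P'}\frac{4\pi s_0^2}{s_0} \to 0 \quad\text{as } s_0 \to 0$$

provided $\Phi$ has a finite derivative. The second integral on the right of 
(62), when taken over $\sigma_1$ becomes, in the limit as $s_0 \to 0$, 

$$-\int_{\sigma_1} \Phi\,\nabla\!\left(\frac{1}{s}\right)\cdot d\boldsymbol{\sigma} \to -\Phi\big|_{P'}\left(-\frac{1}{s_0^2}\right)\left(-4\pi s_0^2\right) = -4\pi\Phi_{P'}$$

Hence, if it is assumed that $s_0 \to 0$, eq. (62) reduces to 

$$\int \frac{f}{s}\,d\tau = -4\pi\Phi(x',y',z') + \int_{\sigma_2}\left[\frac{\nabla\Phi}{s} - \Phi\,\nabla\!\left(\frac{1}{s}\right)\right]\cdot d\boldsymbol{\sigma} \tag{7-63}$$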


Here the remaining integral over $\sigma_2$ on the right has a rather simple meaning. 
It is a solution of Laplace’s equation in the form: $\nabla'^2\Phi(x',y',z') = 0$, 
for the only quantities which depend on the primed coordinates are 1/s 
and $\nabla(1/s)$, and these clearly satisfy it. Hence if this whole integral were 
subtracted from $\Phi$, the remainder would still satisfy eq. (60). Indeed, 
$\int_{\sigma_2}$ represents the contribution to $\Phi$ coming from those parts of f(x,y,z) which 
lie outside of $\tau$. In the electrostatic case, $\int_{\sigma_2}$ represents the potential due 
to the charge outside of the volume $\tau$ considered. 


The integral over $\sigma_2$ may be eliminated in another way. Suppose we 
allow $\tau$ to become infinite and impose on $\Phi$ the boundary condition that, at 
infinity, it vanish at least as strongly as 1/r. Then $\nabla\Phi/s$ and $\Phi\nabla(1/s)$ are 
both of order $1/r^3$ at ∞, and after the surface integration, which amounts 
to multiplication by $r^2$, the result will still be of the order 1/r and hence 
vanish. 

Of interest, therefore, is chiefly the particular solution which remains 
when the integral over $\sigma_2$ in (63) is omitted; it is usually referred to as the 
solution of Poisson’s equation. Thus 

$$\Phi(x',y',z') = -\frac{1}{4\pi}\int \frac{f(x,y,z)}{s}\,dx\,dy\,dz \tag{7-64}$$


Problem. Show that, when f(x,y,z) is different from zero only within a finite volume 
$\tau_0$ such that $\int f(x,y,z)\,d\tau = q$, then for any point (x′,y′,z′) far removed from $\tau_0$, 

$$\Phi(x',y',z') = -\frac{1}{4\pi}\,\frac{q}{r'}$$
the origin being chosen inside $\tau_0$. Interpret this result in electrostatics. 
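
The far-field behavior asserted in the problem can be explored numerically. The sketch below approximates the volume integral (7-64) by a midpoint sum over a cube centered at the origin and compares the result, at a distant point, with $-q/4\pi r'$. The cubical source region, the uniform density and the grid resolution are assumptions chosen only for illustration.

```python
import math

def poisson_potential(f, point, half=0.5, n=8):
    """Midpoint-rule estimate of eq. (7-64) for a source confined to the cube |x|,|y|,|z| < half."""
    h = 2.0 * half / n
    xp, yp, zp = point
    total = 0.0
    for i in range(n):
        x = -half + (i + 0.5) * h
        for j in range(n):
            y = -half + (j + 0.5) * h
            for k in range(n):
                z = -half + (k + 0.5) * h
                s = math.sqrt((x - xp) ** 2 + (y - yp) ** 2 + (z - zp) ** 2)
                total += f(x, y, z) / s * h ** 3
    return -total / (4.0 * math.pi)

if __name__ == "__main__":
    q = 1.0                      # unit density over a unit cube gives total source q = 1
    r_far = 10.0
    phi = poisson_potential(lambda x, y, z: 1.0, (r_far, 0.0, 0.0))
    print(phi, -q / (4.0 * math.pi * r_far))
```

For a point ten cube-widths away the two numbers should be very close; the agreement deteriorates as the observation point approaches the source region.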
CHAPTER 8 
EIGENVALUES AND EIGENFUNCTIONS 


8.1. Simple Examples of Eigenvalue Problems.—It frequently happens 
in mathematical analysis that a given equation, or a set of equations, 
yields solutions which are in general uninteresting or trivial, except when 
a certain parameter appearing in the equations is given a definite value. 
Such circumstances give rise to eigenvalues¹ and eigenfunctions.¹ Their 
occurrence is so common that it often goes unrecognized. For illustration, 
let us take a very simple (and useless) example. 

Suppose one wishes to solve the two simultaneous equations 

$$(1 - \lambda)x + 2y = 0, \qquad 2x + (1 - \lambda)y = 0$$

To be sure, they always possess solutions; but they are almost always 
x = 0, y = 0. Only for two values of the parameter λ will this not be true: 
for λ = 3 the solution is x = y; for λ = −1 it is x = −y. (The numerical 
values of x or y are of course never fixed by the linear homogeneous equations 
above.) The two values of λ for which the equations possess non-vanishing 
solutions are said to be eigenvalues; the two corresponding solutions 
are called eigenfunctions. 
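
The two eigenvalues quoted above may be recovered by requiring the determinant of the coefficients to vanish. The following sketch, whose function names are hypothetical, solves the resulting quadratic for a general pair of equations $(a - \lambda)x + by = 0$, $cx + (d - \lambda)y = 0$ and returns one non-vanishing solution for each eigenvalue; it assumes real roots and b ≠ 0.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Roots of (a - lam)(d - lam) - b c = 0, assuming they are real."""
    trace = a + d
    det = a * d - b * c
    disc = math.sqrt(trace * trace - 4.0 * det)
    return ((trace - disc) / 2.0, (trace + disc) / 2.0)

def nontrivial_solution(a, b, lam):
    """A nonzero pair (x, y) with (a - lam) x + b y = 0, assuming b != 0."""
    return (b, lam - a)

if __name__ == "__main__":
    # Coefficients of the example in the text: (1 - lam) x + 2 y = 0, 2 x + (1 - lam) y = 0
    for lam in eigenvalues_2x2(1.0, 2.0, 2.0, 1.0):
        print(lam, nontrivial_solution(1.0, 2.0, lam))
```

For the example of the text the roots are −1 and 3, with solutions proportional to (1, −1) and (1, 1) respectively.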

Eigenvalues are not always denumerable and discrete, as in the fore- 
going example. To show this, we choose an even more trivial illustration. 
The equation 

$$x^2 = \lambda$$

always possesses a solution. If, however, we wish a real solution, λ is at 
once limited to the domain of positive numbers. Hence we may properly 
say that $x^2 = \lambda$ is an equation leading to eigenvalues: λ ≥ 0 and corresponding 
eigenfunctions $x = \sqrt{\lambda}$. 

In both examples eigenvalues were called into being by the imposition of 
special conditions: in the first that the solutions shall not vanish everywhere; 
in the second that the solution shall be real. This is generally true; 
eigenvalues are always produced by special requirements placed upon the 
solutions of equations. In the most interesting cases of physics and chemistry, 
these equations are differential or integral equations, and the conditions 
are boundary conditions. We now turn to some cases of greater 
scientific interest. 

¹ The terms eigenvalue and eigenfunction, because of their brevity, appear to be 
rapidly displacing their classical synonyms: characteristic value and characteristic 
function, at least in the physical and chemical literature. 

8.2. Vibrating String; Fourier Analysis.—In classical physics, many 
eigenvalue problems occur in connection with vibrating systems. The 
simplest of these is the problem of a vibrating string. Consider the string 
to extend along the X-axis, to be fastened with its left end at the origin and 
its right end at x = l. From elementary physics we recall that if its mass 
per unit length is m and its tension T, the speed of waves along the string 
is given by $v = \sqrt{T/m}$. The wave equation, discussed in sec. 7.8, will 
then read 

$$\frac{\partial^2 U}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 U}{\partial t^2} \tag{8-1}$$
U is the vertical displacement of the points along x. 

We restrict our attention for the moment to types of vibration having a 
single frequency ν, or angular frequency ω = 2πν, so that $U = S(x)e^{\pm i\omega t}$ or, 
if we care to limit the analysis to real functions, $S(x)\sin\omega t$ or $S(x)\cos\omega t$. The 
function S will then satisfy the ordinary differential equation 
$$\frac{d^2S}{dx^2} + k^2 S = 0 \tag{8-2}$$

where, in conformity with the usage of Chapter 7, the abbreviation 

$$k = \frac{\omega}{v} = \frac{2\pi}{\lambda} \tag{8-3}$$
has been used. Here λ stands again for the wave length of the disturbance 
produced. The general solution of (2) is, clearly, 

$$S = A\sin(kx + \delta) \tag{8-4}$$


where A and δ are arbitrary constants. Every solution of the form (4) is 
perfectly acceptable as far as the differential equation is concerned, but it 
does not describe the behavior of the string. Solution (4) permits the ends 
of the string to vibrate, whereas the physical condition requires them to be 
fixed. It is therefore necessary to impose the following boundary conditions 
upon the solutions (4): 

(a) S(0) = 0 
(b) S(l) = 0 

Both can of course be satisfied by putting A = 0, but this would lead to 
the unwanted solution U = 0 everywhere. Hence there is only the second 
arbitrary constant, δ, left for adjustment. It must be taken to be zero in 
order to satisfy condition (a). (Choice of π, 2π, etc., leads to the same 
final result.) But the function S = A sin kx will not, in general, obey (b). Thus the 
problem can be solved only if we are willing to tamper with k: we are led to 
eigenvalues. If sin kl is to be zero, k must be 0, or π/l, 2π/l, ⋯, nπ/l. The 
value 0, however, is excluded for the same reason that A = 0 was rejected. 
To each eigenvalue of k = nπ/l (n integral), there corresponds an 
eigenfunction $S_n = A_n \sin n\pi x/l$. These eigenfunctions are of course 
undetermined with respect to the constant multipliers, $A_n$, which may be 
chosen at will, and which may be different for every n. 

Since k is related to λ by eq. (3), there is thus generated a corresponding 
set of eigenvalues for λ, namely λ = 2l/n, n integral. This is the well 
known equation for the wave lengths of standing waves supported by a 
vibrating string. In the simplest mode of vibration, corresponding to the 
fundamental frequency, λ = 2l, the string has nodes only at the end points. 
For the first harmonic, λ = l, there is in addition a node at the center of the 
string, and so on. In general, the number of nodes is n + 1. 

The eigenfunctions under consideration have two important properties 
which, as we shall see in sec. 8.5 et seq., are common to a large class of 
eigenfunctions arising in connection with different problems. They are 
(1) orthogonality, (2) completeness. To explain the meaning of these terms, 
let us arrange the eigenvalues of eq. (2) in a definite order, $k_n = n\pi/l$, 
n = 1, 2, 3, ⋯; and write again $S_n = A_n \sin n\pi x/l$. Orthogonality 
means: 

$$\int_0^l S_n(x)\,S_m(x)\,dx = c_n\delta_{nm} \tag{8-5}$$

The word comes originally from vector analysis (cf. Chapter 4) where two 
vectors, A and B, are said to be orthogonal if $\mathbf{A}\cdot\mathbf{B} = A_xB_x + A_yB_y + A_zB_z = 0$. 
Similarly, vectors in N dimensions having components $A_i$, $B_i$ 
(i = 1, 2, ⋯ N) are said to be orthogonal when $\sum_{i=1}^{N} A_iB_i = 0$. If now we 
imagine a vector space of an infinite number of dimensions, in which the 
components $A_i$ and $B_i$ become continuously distributed and everywhere 
dense, i is no longer a denumerable index but a continuous variable (x) 
and the scalar product $\sum_i A_iB_i$ turns into $\int A(x)B(x)\,dx$. If it is zero, the 
functions A and B are said to be orthogonal, and this is the sense in which 
the word is used above. 


The idea of orthogonality is indefinite unless reference is made to a 
specific range of integration, which in the present case is from 0 to l. Here 
the validity of eq. (5) is at once verified on substitution of the S-functions, 
and the constant $c_n$ is seen to be 
$$A_n^2\int_0^l \sin^2\frac{n\pi x}{l}\,dx = A_n^2\cdot\frac{l}{n\pi}\int_0^{n\pi}\sin^2 u\,du = \frac{l}{2}A_n^2$$
For many purposes it is convenient to have $c_n$ equal to unity. This 
can always be achieved by a suitable choice of $A_n$. In the present case, 
every $c_n = 1$ if $A_n = \sqrt{2/l}$. If, therefore, we write $S_n = \sqrt{2/l}\,\sin n\pi x/l$, 
the orthogonality relation (5) reads 

$$\int_0^l S_n(x)\,S_m(x)\,dx = \delta_{nm} \tag{8-5′}$$

When the constants $A_n$ are thus chosen the eigenfunctions are said to be 
normalized; functions satisfying (5′) will henceforth be termed ortho-normal. 
It is clear that a set of functions having the property of orthogonality 
(expressed by eq. 5) can always be made ortho-normal by a proper 
choice of multiplicative constants. 
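
Relation (5′) is easily confirmed by numerical integration. The sketch below, with hypothetical function names and an arbitrary number of sample points, evaluates the overlap integral of two normalized string eigenfunctions by the midpoint rule.

```python
import math

def mode(n, x, l=1.0):
    """Normalized eigenfunction S_n(x) = sqrt(2/l) sin(n pi x / l)."""
    return math.sqrt(2.0 / l) * math.sin(n * math.pi * x / l)

def overlap(n, m, l=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of S_n S_m over (0, l)."""
    h = l / steps
    return sum(mode(n, (j + 0.5) * h, l) * mode(m, (j + 0.5) * h, l)
               for j in range(steps)) * h

if __name__ == "__main__":
    print(overlap(2, 2, l=3.0))   # close to 1
    print(overlap(1, 3, l=3.0))   # close to 0
```

The choice l = 3 in the usage lines is arbitrary; any positive length gives the same pattern of ones and zeros.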

A simple modification in the idea of orthogonality is to be made when 
complex functions are considered. For these, condition (5) must be replaced 
by 

$$\int S_n^*(\xi)\,S_m(\xi)\,d\xi = c_n\delta_{nm} \tag{8-5*}$$


where S* represents the complex conjugate of S. This definition will be 
used in later work. 

We turn to the second property, that of completeness. A set of functions 
is said to be complete if an arbitrary function, f(x), satisfying the same 
boundary conditions as the functions of the set, can be expanded as follows: 


$$f(x) = \sum_n a_n S_n(x) \tag{8-6}$$

the $a_n$ being constant coefficients. 

In the present instance, eq. (6) is equivalent to the theorem of Fourier 
which states, in its simplest form, that a function f(x) which vanishes both 
at x = 0 and at x = π (and has but a finite number of finite discontinuities) 
may always be written² 

$$f(x) = \sum_{n=1}^{\infty} a_n \sin nx \tag{8-7a}$$

the coefficients being given by 
$$a_n = \frac{2}{\pi}\int_0^{\pi} f(\xi)\sin n\xi\,d\xi \tag{8-7b}$$

² See, for instance, Byerly, W. E., “Fourier Series and Spherical Harmonics,” 
Ginn and Co., 1893, p. 38. The formulas 7a, b are special cases of eq. 42, developed 
later in this chapter. Note also the more precise definition of completeness given in 
sec. 8.8. 
Eqs. (7) may be modified by using, in place of x and ξ, the variables πx/l 
and πξ/l. This has the effect of changing the range of x from (0,π) to 
(0,l), and the results are 

$$f(x) = \sum_{n=1}^{\infty} a_n \sin\left(\frac{n\pi x}{l}\right) \tag{8-8a}$$
$$a_n = \frac{2}{l}\int_0^{l} f(\xi)\sin\left(\frac{n\pi\xi}{l}\right)d\xi \tag{8-8b}$$

If S is taken in its normalized form, these equations read simply 
$$f(x) = \sum_{n=1}^{\infty} a_n S_n(x), \qquad a_n = \int_0^{l} S_n(\xi)\,f(\xi)\,d\xi$$
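
Formulas (8-8a, b) lend themselves to a simple numerical experiment. The sketch below computes the sine coefficients of the parabola f(x) = x(l − x), which vanishes at both ends of the range, and sums a truncated series. The test function, the number of terms and the quadrature resolution are assumptions made for illustration; for this f the coefficients are known to be $8l^2/(n\pi)^3$ for odd n and zero for even n.

```python
import math

def sine_coefficient(f, n, l, steps=4000):
    """Midpoint-rule estimate of a_n = (2/l) * integral of f(xi) sin(n pi xi / l) over (0, l)."""
    h = l / steps
    acc = 0.0
    for j in range(steps):
        xi = (j + 0.5) * h
        acc += f(xi) * math.sin(n * math.pi * xi / l)
    return 2.0 / l * acc * h

def sine_series(coeffs, x, l):
    """Partial sum of eq. (8-8a); coeffs[0] multiplies sin(pi x / l)."""
    return sum(a * math.sin((n + 1) * math.pi * x / l) for n, a in enumerate(coeffs))

if __name__ == "__main__":
    l = 2.0
    f = lambda x: x * (l - x)
    coeffs = [sine_coefficient(f, n, l) for n in range(1, 16)]
    print(coeffs[0], 8.0 * l * l / math.pi ** 3)
    print(sine_series(coeffs, 0.7, l), f(0.7))
```

Because the coefficients fall off as the inverse cube of n, a partial sum of fifteen terms already reproduces the parabola to about three decimal places.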


The fact of completeness has an important bearing on the problem of the 
vibrating string which we originally set out to solve. While it is true that 
only a particular $S_n$, for which k assumes a specific eigenvalue, is a solution 
of eq. (2) [the series (6) would not be a solution of (2)!], the value of k is 
not prescribed by eq. (1). Hence eq. (1) is satisfied by 

$$U = \sum_n c_n S_n(x)\cos\omega_n t, \qquad \omega_n = k_n v$$

with arbitrary coefficients $c_n$. This, then, is the most general solution of 
the string problem. It reduces to a series like (6) for t = 0, a series which 
can be chosen to represent any function f(x) which vanishes at the end 
points. Hence it is seen that any initial configuration of the string will 
yield a solution of eq. (1), that is, a (standing) wave. 

Fourier analysis is so useful a tool in applied mathematics that it seems 
well here to digress for a moment and summarize its essential features 
beyond the needs of the present problem. Details may be found in the 
book by Byerly already mentioned. The general theory, including proofs 
for the statements here made, will be found in Secs. 8.5-8.8. A function 
f(x) defined between x = 0 and x = π may also be expanded as a cosine 
series: 

$$f(x) = \tfrac{1}{2}b_0 + \sum_{n=1}^{\infty} b_n\cos nx \tag{8-9a}$$
where 

$$b_n = \frac{2}{\pi}\int_0^{\pi} f(\xi)\cos n\xi\,d\xi \tag{8-9b}$$
except that the series may not yield the same values as f(x) at discontinuities 
and at the end points. Otherwise the developments (7) and (9) are 
equivalent. There is, however, an interesting difference in the values of the 
two series when they are extrapolated to the range (−π,0). Here the series 
(7) changes sign in such a way that f(−x) = −f(x), while series (9) yields 
f(−x) = f(x), as is evident from the fact that sin x is an odd, cos x an even 
function. Thus, if it is desired to expand a function between −π and +π, 
series (7) can be used only if the function is odd, series (9) when it is even. 
Now any function can be represented as the sum of an even and an odd one. 
Hence, if an arbitrary function f(x) is to be developed between −π and π, 
both cosine and sine series must be used. It is evident, therefore, that in 
this more general case 

$$f(x) = \sum_{n=1}^{\infty} a_n\sin nx + \tfrac{1}{2}b_0 + \sum_{n=1}^{\infty} b_n\cos nx \tag{8-10a}$$
where 

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\xi)\sin n\xi\,d\xi; \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\xi)\cos n\xi\,d\xi \tag{8-10b}$$

The coefficients in front of the integrals are most easily checked as follows: 
Multiply (10a) by sin mx and integrate over x between −π and π. Because 
of the mutual orthogonality of the functions sin nx, sin mx, cos nx, for n ≠ 
m the relations (10b) are at once apparent. 

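If f(x) is defined, not in the range (−π,π), but in (−l,l), a simple change 
of variable from x, ξ to (π/l)x, (π/l)ξ in eq. (10) will produce the required 
modification. The result is 

$$\begin{aligned} f(x) &= \sum_{n=1}^{\infty} a_n\sin\frac{n\pi x}{l} + \tfrac{1}{2}b_0 + \sum_{n=1}^{\infty} b_n\cos\frac{n\pi x}{l} \\ a_n &= \frac{1}{l}\int_{-l}^{l} f(\xi)\sin\frac{n\pi\xi}{l}\,d\xi; \qquad b_n = \frac{1}{l}\int_{-l}^{l} f(\xi)\cos\frac{n\pi\xi}{l}\,d\xi \end{aligned} \tag{8-11}$$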


This may be expressed more simply in complex form. For if the sine and 
cosine functions are written in their exponential form, the reader will verify 
without difficulty³ that 

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{in\pi x/l}, \qquad c_n = \frac{1}{2l}\int_{-l}^{l} f(\xi)\,e^{-in\pi\xi/l}\,d\xi \tag{8-12}$$

The coefficients $c_n$ in this expansion are complex. 
When the series f(x) as given by (12) is extrapolated beyond the range 
(−l,l) the function f(x) is repeated periodically in every interval between 
(2n + 1)l and (2n + 3)l. Hence formula (12) permits representation of 
periodic functions only. One may wonder, therefore, whether it is possible 
to perform a Fourier analysis of a non-periodic function, defined in the 
range of the entire real axis. Highly technical considerations for which the 
reader is referred to more specific treatises⁴ affirm this possibility, provided 
the function, f(x), to be expanded is piecewise continuous and such that the 
integral $\int_{-\infty}^{\infty} |f(x)|\,dx$ exists. In that case 

$$f(x) = \int_{-\infty}^{\infty} c(k)\,e^{ikx}\,dk, \qquad c(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\xi)\,e^{-ik\xi}\,d\xi \tag{8-13}$$

³ Note that $e^{in\pi x/l}$ and $e^{im\pi x/l}$ are orthogonal functions in the sense of eq. 5*, the 
range of integration being (−l,l). 


These equations may be written more symmetrically by putting 
$c(k) = (1/\sqrt{2\pi})\,g(k)$. They then become 

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(k)\,e^{ikx}\,dk, \qquad g(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)\,e^{-ik\xi}\,d\xi \tag{8-13}$$

Two functions f and g related by eq. (13) are called a pair of Fourier transforms; 
i.e., g is the Fourier transform of f and vice versa. Such pairs are of 
great importance in the analysis of electrical impulses and in quantum 
mechanics, where they effect the transformation from coordinate to momentum 
space. 


Problems. 
a. Show that the Fourier transform of $f(x) = e^{-x^2/2}$ is $g(k) = e^{-k^2/2}$. (This fact is 
occasionally expressed by saying: the error function $e^{-x^2/2}$ is its own Fourier transform.) 

b. Show that the F.T. of the step function $f(x) = \sqrt{2\pi}/2l$ if |x| < l and vanishing 
if |x| > l is g(k) = sin kl/kl. Note: as l approaches zero, f(x) becomes ∞ at 
x = 0. It is then called a “unit impulse” function, or a δ-function. Its transform 
g(k) = 1. 

⁴ E.g., Titchmarsh, E. C., “Introduction to the Theory of Fourier Integrals,” Oxford 
University Press, 1937. 

⁵ For further considerations see v. Kármán, T., and Biot, M., “Mathematical 
Methods in Engineering,” McGraw-Hill Book Co., 1940. An extensive list of Fourier 
transforms has been compiled by Campbell, G. A., and Foster, R. M., “Fourier Integrals 
for Practical Applications,” Bell Tel. Syst. Tech. Pub. Monograph B-584, 1931. See 
also Magnus, W., and Oberhettinger, F., “Formulas and Theorems for the Special 
Functions of Mathematical Physics,” Chelsea Publishing Co., New York, 1949. 
c. Show that the F.T. of 
$$f(x) = \begin{cases} \cos k_0x & \text{if } |x| < l \\ 0 & \text{if } |x| > l \end{cases}$$
is 
$$g(k) = \frac{1}{\sqrt{2\pi}}\left\{\frac{\sin[(k_0 - k)l]}{k_0 - k} + \frac{\sin[(k_0 + k)l]}{k_0 + k}\right\}$$
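
The first two problems can be checked by direct numerical evaluation of the symmetrical transform (13). The sketch below approximates the integral for g(k) by the midpoint rule over a finite range; the range, the number of sample points and the function names are illustrative assumptions.

```python
import cmath
import math

def fourier_transform(f, k, half_range=12.0, steps=6000):
    """Midpoint-rule estimate of g(k) = (2 pi)^(-1/2) * integral of f(xi) exp(-i k xi) d(xi)."""
    h = 2.0 * half_range / steps
    acc = 0j
    for j in range(steps):
        xi = -half_range + (j + 0.5) * h
        acc += f(xi) * cmath.exp(-1j * k * xi)
    return acc * h / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    k = 1.3
    gauss = lambda x: math.exp(-x * x / 2.0)
    print(fourier_transform(gauss, k), math.exp(-k * k / 2.0))

    l = 0.5
    step = lambda x: math.sqrt(2.0 * math.pi) / (2.0 * l) if abs(x) < l else 0.0
    print(fourier_transform(step, 2.0), math.sin(2.0 * l) / (2.0 * l))
```

In both cases the imaginary part of the computed transform should be negligible, since the functions chosen are even.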


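The Fourier Integral Theorem may be deduced immediately from (13). 
On putting g(k) into the expression for f(x), there results 

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\int_{-\infty}^{\infty} f(\xi)\,e^{ik(x-\xi)}\,d\xi \tag{8-14}$$

When f(x) is real, the imaginary part of $e^{ik(x-\xi)}$ may clearly be neglected, 
and the Fourier integral theorem takes the more customary form: 

$$f(x) = \frac{1}{\pi}\int_0^{\infty} dk\int_{-\infty}^{\infty} f(\xi)\cos k(x-\xi)\,d\xi \tag{8-14′}$$

Finally one may derive from (14) a result sometimes called the Dirichlet 
integral. On performing the integration not between infinite limits but 
between −A and A, and then passing to the limit A → ∞ we find 

$$f(x) = \frac{1}{\pi}\lim_{A\to\infty}\int_{-\infty}^{\infty} f(\xi)\,\frac{\sin[A(x-\xi)]}{x-\xi}\,d\xi \tag{8-15}$$

As a special form of (15) we note: 

$$f(0) = \frac{1}{\pi}\lim_{A\to\infty}\int_{-\infty}^{\infty} f(x)\,\frac{\sin Ax}{x}\,dx$$

The expression $\dfrac{1}{\pi}\lim\limits_{A\to\infty}\dfrac{\sin[A(x-\xi)]}{x-\xi}$ or $\dfrac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-\xi)}\,dk$ is called the 
Dirac δ-function and denoted by δ(x,ξ). Eq. (15) may therefore be written 

$$f(x) = \int f(\xi)\,\delta(x,\xi)\,d\xi \tag{8-15′}$$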
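
The limiting process in (15) can be watched numerically. The sketch below evaluates the Dirichlet integral for a Gaussian test function at a finite value of A; the test function, the finite integration range and the names used are assumptions chosen for illustration only.

```python
import math

def dirichlet_integral(f, x, A, half_range=8.0, steps=8000):
    """Midpoint-rule estimate of (1/pi) * integral of f(xi) sin[A (x - xi)] / (x - xi) d(xi)."""
    h = 2.0 * half_range / steps
    total = 0.0
    for j in range(steps):
        xi = -half_range + (j + 0.5) * h
        d = x - xi
        # the kernel tends to A as xi approaches x
        kernel = A if abs(d) < 1e-12 else math.sin(A * d) / d
        total += f(xi) * kernel
    return total * h / math.pi

def gaussian(xi):
    return math.exp(-xi * xi)

if __name__ == "__main__":
    for A in (1.0, 2.0, 4.0, 8.0):
        print(A, dirichlet_integral(gaussian, 0.4, A), gaussian(0.4))
```

As A grows the computed value approaches f(x); for small A the kernel is too broad to single out the value of f at the point x.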


All the foregoing results can be generalized⁶ to permit expansion of functions 
of several variables, provided they satisfy the condition that 

$$\int |f(x,y,z,\cdots)|\,dx\,dy\,dz\cdots$$

exists. For instance, in place of (12) we have 

$$\begin{aligned} f(x,y) &= \sum_{m,n=-\infty}^{\infty} c_{mn}\,e^{(i\pi/l)(mx+ny)} \\ c_{mn} &= \frac{1}{4l^2}\int_{-l}^{l}\int_{-l}^{l} f(\xi,\eta)\,e^{-(i\pi/l)(m\xi+n\eta)}\,d\xi\,d\eta \end{aligned} \tag{8-16}$$

⁶ See Courant, R., “Vorlesungen über Differential- und Integralrechnung,” Vol. 1, 
Second Edition, p. 373. 




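and in place of (13), 

$$\begin{aligned} f(x,y) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(k_1,k_2)\,e^{i(k_1x+k_2y)}\,dk_1\,dk_2 \\ g(k_1,k_2) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(\xi,\eta)\,e^{-i(k_1\xi+k_2\eta)}\,d\xi\,d\eta \end{aligned} \tag{8-17}$$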


8.3. Vibrating Circular Membrane; Fourier-Bessel Transforms.—The 
mathematical description of the vibrating membrane also leads to an inter- 
esting eigenvalue problem. The wave equation, when written in polar 
coordinates, was shown in sec. 7.10 (cf. eq. 7-35) to have the solution 

$$U = S\cdot T, \qquad S_{k,m} = Z_m(k\rho)\,e^{im\varphi} \tag{8-18}$$

The fact, pointed out before, that m must here be an integer to ensure that the 
function is physically meaningful ($e^{im\varphi}$ must be the same as $e^{im(\varphi+2\pi)}$ 
because φ and φ + 2π denote the same angle in the problem of the membrane) 
may also be expressed by saying: the eigenvalues of m in the differential 
equation $\Phi'' = -m^2\Phi$ are all integers. Note that the corresponding 
eigenfunctions, $e^{im\varphi}$, are orthogonal and form a complete set, the 
range being (0,2π). But we wish here to discuss another, less simple 
eigenvalue problem. 

Consider modes of vibration of the membrane which have circular 
symmetry. This limits m to the value zero, and (18) becomes 


$$S_k = Z_0(k\rho) \tag{8-19}$$

We now impose the boundary condition: U = 0 at all times at the periphery 
of the membrane, corresponding to the physical condition of having the 
edge fixed. If the radius of the membrane is a, this means 

$$Z_0(ka) = 0 \tag{8-20}$$

The function $Z_0$ is a linear combination of the Bessel functions $J_0$ and $N_0$, 
a Bessel function of the second kind which is linearly independent of $J_0$ 
(sometimes called a Neumann function). But the latter may be shown to 
be infinite at ρ = 0 and must therefore be excluded. The $Z_0$ in (19) and 
(20) must therefore be interpreted as $J_0$. To satisfy (20) the parameter k 
must be so adjusted as to make ka a root of $J_0$, and since $J_0$ has an infinite 
number of roots,⁷ the eigenvalues of k will form an infinite set $k_i = x_i/a$, 
where $x_i$ is the i-th root of $J_0(x)$. The corresponding eigenfunctions 
are $J_0(k_i\rho)$. 

⁷ The values of the roots of $J_0(x)$ are listed in books on Bessel functions. See also 
Jahnke, E., and Emde, F., “Tables of Functions,” Dover Publications, New York, 1943. 




Are these functions orthogonal? It is not difficult to show that 
$\int_0^a J_0(k_1\rho)\,J_0(k_2\rho)\,d\rho$ is different from zero (an inspection of the graph of 
the integrand will convince the reader). Thus it seems that eq. (5) fails 
in this example. But we have overlooked an important feature: the element 
of area of the circular membrane is not dρ, but 2πρ dρ. And now it 
will be found that 

$$\int_0^a J_0(k_n\rho)\,J_0(k_m\rho)\,\rho\,d\rho = c_n\delta_{n,m} \tag{8-21}$$

As the present problem shows, specification of a range of integration is 
not sufficient in defining orthogonality of functions; it is also necessary to 
state the weighting factor associated with each differential range of the coordinate. 
In the problem of the vibrating string, the weighting factor w(x) 
happened to be unity; here it is w(ρ) = ρ. In the next example it will be 
seen to be $\rho^2$. The same w which appears in the orthogonality relation will 
also occur in the integrals defining expansion coefficients (cf. eq. 42). 

To prove eq. (21) for m ≠ n we use the last of the formulas in sec. 3.9, 
according to which the left-hand side has the value 0 because both $J_0(k_na)$ 
and $J_0(k_ma)$ vanish. According to another formula in this list, 

$$\int_0^a [J_0(k_n\rho)]^2\,\rho\,d\rho = -\frac{a^2}{2}\,J_{-1}(k_na)\,J_1(k_na)$$

But in view of eq. (3-69), $J_{-1} = -J_1$, so that the constant $c_n$ in (21) has 
the value $(a^2/2)\,[J_1(k_na)]^2$. 
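
The role of the weighting factor ρ may be illustrated numerically. In the sketch below the Bessel functions are computed from the integral representation $J_n(x) = (1/\pi)\int_0^{\pi}\cos(n\theta - x\sin\theta)\,d\theta$, and the overlap integrals are evaluated by the midpoint rule for a membrane of unit radius. The first two zeros of $J_0$ are quoted from standard tables; the function names and quadrature settings are assumptions made for this illustration.

```python
import math

def bessel_j(n, x, steps=200):
    """J_n(x) for integer n from (1/pi) * integral over (0, pi) of cos(n theta - x sin theta)."""
    h = math.pi / steps
    return sum(math.cos(n * (j + 0.5) * h - x * math.sin((j + 0.5) * h))
               for j in range(steps)) * h / math.pi

# First two zeros of J_0, quoted from standard tables.
ZEROS_J0 = (2.404825557695773, 5.520078110286311)

def overlap(k1, k2, a=1.0, weighted=True, steps=400):
    """Midpoint-rule estimate of the integral of J0(k1 rho) J0(k2 rho) w(rho) over (0, a)."""
    h = a / steps
    total = 0.0
    for j in range(steps):
        rho = (j + 0.5) * h
        w = rho if weighted else 1.0
        total += bessel_j(0, k1 * rho) * bessel_j(0, k2 * rho) * w
    return total * h

if __name__ == "__main__":
    k1, k2 = ZEROS_J0            # a = 1, so k_i equals the i-th zero
    print(overlap(k1, k2, weighted=False))   # clearly different from zero
    print(overlap(k1, k2))                   # close to zero
    print(overlap(k1, k1), 0.5 * bessel_j(1, k1) ** 2)
```

The last line compares the normalization integral with $(a^2/2)[J_1(k_na)]^2$ for a = 1.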


The question of the completeness of the functions $J_0(k_n\rho)$, i.e., the possibility 
of the expansion 

$$f(\rho) = \sum_n a_n J_0(k_n\rho) \tag{8-22}$$

will be investigated in sec. 8. We shall here anticipate completeness provided, 
of course, that f(ρ) vanishes also at ρ = a. Granting this, the coefficients 
$a_n$ may be computed in the manner already illustrated in connection 
with Fourier series: 

Multiply both sides of (22) by $J_0(k_m\rho)\,\rho\,d\rho$ and integrate. The result 
is, again in view of (21), 

$$\int_0^a f(\rho)\,J_0(k_m\rho)\,\rho\,d\rho = a_m\cdot\frac{a^2}{2}\,[J_1(k_ma)]^2$$

If we use the normalized function $S_n = \{(a/\sqrt{2})\,J_1(k_na)\}^{-1} J_0(k_n\rho)$, the 
expansion reads 

$$f(\rho) = \sum_n a_n S_n(\rho)$$
and the coefficients are 
$$a_n = \int_0^a f(\rho)\,S_n(\rho)\,\rho\,d\rho$$


The problem of the circular membrane has been simplified by our 
assumption of circular symmetry. One may wonder what happens if 
types of vibrations are permitted in which the displacement is a function of 
both ρ and φ, for these certainly occur. It is then necessary to use the functions 
$S_{k,m}$ defined in eq. (18). These may easily be seen to be orthogonal 
with respect to both indices, i.e., 

$$\int_0^a \rho\,d\rho\int_0^{2\pi} J_m(k_n\rho)\,e^{-im\varphi}\,J_{m'}(k_{n'}\rho)\,e^{im'\varphi}\,d\varphi = c\,\delta_{nn'}\delta_{mm'}$$

Moreover it is possible to expand 
$$f(\rho,\varphi) = \sum_n\sum_m a_{nm}\,J_m(k_n\rho)\,e^{im\varphi}$$

The details of this development may be left as an exercise to the interested 
reader; they are worked out fully in some works on sound.⁸ 

The condition f(a) = 0, upon which the expansion (22) was based, 
may be removed; the range of integration must then be extended from 0 to 
∞. Now it is clear that, as a → ∞, the values $k_n$ move closer and closer 
together. In the limit they will, in fact, form a continuum. When the 
passage to this limit is performed, eq. (22) becomes what is known as a 
Fourier-Bessel integral,⁹ an equation which is useful in the theory of radiation. 
While the transition to the limit is difficult, the result may be 
obtained quite simply by a method used by Stratton,¹⁰ which will here be 
given. 

Suppose f(x,y) can be expanded according to eq. (17). In these 
equations, we transform the variables of integration to polar form: 

$$x = \rho\cos\varphi, \quad y = \rho\sin\varphi; \qquad k_1 = k\cos\alpha, \quad k_2 = k\sin\alpha$$
They then read 
$$f(\rho,\varphi) = \frac{1}{2\pi}\int_0^{\infty} k\,dk\int_0^{2\pi} g(k,\alpha)\,e^{ik\rho\cos(\varphi-\alpha)}\,d\alpha \tag{8-23a}$$

$$g(k,\alpha) = \frac{1}{2\pi}\int_0^{\infty} \rho\,d\rho\int_0^{2\pi} f(\rho,\varphi)\,e^{-ik\rho\cos(\varphi-\alpha)}\,d\varphi \tag{8-23b}$$

⁸ See particularly Morse, P. M., “Vibration and Sound,” McGraw-Hill Book Co., 
1936, p. 153 et seq. 

⁹ We are here following a terminology which seems to be gaining ground, although we 
have been unable to discover its origin. It appears that relations of the form (24) were 
first discovered by Hankel. 

¹⁰ Stratton, J. A., “Electromagnetic Theory,” McGraw-Hill Book Co., 1941. 
We now take for $f(\rho,\varphi)$ the special function $f(\rho)e^{im\varphi}$. The integration over
$\varphi$ appearing in (23b) may then be performed with the use of eq. (8-72a),
according to which

$$\int_0^{2\pi} e^{i[m\varphi - k\rho\cos(\varphi-\alpha)]}\,d\varphi = \int_0^{2\pi} e^{i[m\varphi - k\rho\sin(\varphi-\alpha+\pi/2)]}\,d\varphi$$

$$= e^{im(\alpha-\pi/2)}\int_0^{2\pi} e^{i[m\theta - k\rho\sin\theta]}\,d\theta = 2e^{im(\alpha-\pi/2)}\int_0^{\pi}\cos[m\theta - k\rho\sin\theta]\,d\theta$$

$$= 2\pi\,e^{im(\alpha-\pi/2)}\,J_m(k\rho)$$


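Thus

$$g(k,\alpha) = \int_0^\infty f(\rho)J_m(k\rho)\,\rho\,d\rho \cdot e^{im(\alpha-\pi/2)} = g(k)\,e^{im(\alpha-\pi/2)} \tag{8-24b}$$

On putting this answer into (23a) we find

$$f(\rho)e^{im\varphi} = \frac{1}{2\pi}\int_0^\infty g(k)\,k\,dk \int_0^{2\pi} e^{i[m(\alpha-\pi/2) + k\rho\cos(\varphi-\alpha)]}\,d\alpha$$

$$= \frac{1}{2\pi}\int_0^\infty g(k)\,k\,dk \cdot 2\pi\,e^{im\varphi}\,J_m(k\rho) \tag{8-24a}$$

These results may be expressed in the symmetrical form

$$f(\rho) = \int_0^\infty g(k)\,J_m(k\rho)\,k\,dk \tag{8-24a}$$

$$g(k) = \int_0^\infty f(\rho)\,J_m(k\rho)\,\rho\,d\rho \tag{8-24b}$$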


The functions f and g satisfying relations (24a, b) are said to be a pair of
Fourier-Bessel transforms. It is to be noted that the expansion (24a) of
the function $f(\rho)$ holds for every value of the integer m. Eq. (22), therefore,
is a special case of a Fourier-Bessel expansion.
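The pair (24a, b) can be tried out numerically. The following is only a minimal sketch: it evaluates $J_m$ from the integral $\frac{1}{\pi}\int_0^\pi \cos(m\theta - x\sin\theta)\,d\theta$ used in the derivation above, and then transforms the test function $f(\rho) = e^{-\rho^2/2}$ with $m = 0$, for which the known result is $g(k) = e^{-k^2/2}$. The function names, the cutoff `upper`, and the step counts are illustrative assumptions, not part of the text.

```python
import math

def bessel_j(m, x, n=400):
    """J_m(x) from (1/pi) * integral_0^pi cos(m*theta - x*sin(theta)) d(theta), trapezoidal rule."""
    h = math.pi / n
    total = 0.5 * (1.0 + math.cos(m * math.pi))  # endpoint values at theta = 0 and theta = pi
    for i in range(1, n):
        theta = i * h
        total += math.cos(m * theta - x * math.sin(theta))
    return total * h / math.pi

def hankel(func, k, m=0, upper=8.0, n=800):
    """Approximate g(k) = integral_0^upper func(rho) J_m(k*rho) rho d(rho), trapezoidal rule."""
    h = upper / n
    # The rho = 0 endpoint contributes nothing because of the factor rho.
    total = 0.5 * func(upper) * bessel_j(m, k * upper) * upper
    for i in range(1, n):
        rho = i * h
        total += func(rho) * bessel_j(m, k * rho) * rho
    return total * h

def gaussian(rho):
    return math.exp(-0.5 * rho * rho)

g1 = hankel(gaussian, 1.0)
print(g1, math.exp(-0.5))  # the two numbers should agree closely
```

Because the relations are symmetrical, calling `hankel` on the resulting $g$ should reproduce $f$ to the same accuracy.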


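Problem a. Show, using the formulas of sec. 3.9, that the Fourier-Bessel transform of
$f(\rho) = \rho^{\nu}$, with respect to $J_m$, is

$$g(k) = \frac{2^{\nu+1}\,\Gamma\!\left(\dfrac{m+\nu}{2} + 1\right)}{\Gamma\!\left(\dfrac{m-\nu}{2}\right) k^{\nu+2}}$$

Problem b. Verify the identity

$$f(\rho) = \int_0^\infty\!\!\int_0^\infty f(t)\,J_m(k\rho)\,J_m(kt)\,t\,k\,dt\,dk$$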
8.4. Vibrating Sphere with Fixed Surface.—The problem of a sphere
vibrating with a node at its surface is of little interest in acoustics, for if 
there is never any displacement at the surface, the sphere cannot radiate. 
However, the same problem, interpreted quantum-mechanically, describes 
the motion of a particle within a spherical cavity and has as such enjoyed 
some attention in nuclear physics. For the sake of simplicity, we shall 
here maintain the acoustic interpretation. 

The solution, S, of the space part of the wave equation was shown in
eq. (7-43) et seq. to be of the form

$$S_k = \sum_{l=0}^{\infty} c_l\,Y_l(\theta,\varphi)\,r^{-1/2}\,Z_{l+1/2}(kr) \tag{8-25}$$

As usual, k determines the frequency of the vibration: $\nu = kv/2\pi$, v being
the velocity of the waves inside the spherical medium. Eigenvalues in k,
and hence in the frequency “spectrum,” are induced by the boundary
condition

$$S_k = 0 \quad\text{at}\quad r = a,\ \text{the radius of the sphere}$$


According to (25), this is satisfied only if $Z_{l+1/2}(ka) = 0$. Thus, for every
integer l, there exists an infinite sequence of $k_{ln}$ such that $k_{ln}a$ is a root of
$Z_{l+1/2}$. But $Z_{l+1/2}$ is a linear combination of $J_{l+1/2}$ and $J_{-l-1/2}$, of which
only the former can be retained because $r^{-1/2}J_{-l-1/2}$ is always infinite at
r = 0 and does not, therefore, represent a possible mode of vibration.
Hence it is

$$J_{l+1/2}(ka) = 0$$

which determines the eigenvalues of k.
When l = 0 the situation is very simple indeed, for $J_{1/2}(x) = \sqrt{2/\pi x}\,\sin x$.
Thus the k's are determined by sin (ka) = 0, which means that for this case

$$k_{0n} = \frac{n\pi}{a},\qquad n\ \text{an integer}$$


The frequency spectrum is much the same as for the vibrating string. Let
us now see what is the physical meaning of the condition l = 0. A glance
at eq. (25) shows that $Y_0(\theta,\varphi)$ is a constant, and this means there are no
radial nodes. The sphere vibrates in spherical symmetry.

In addition to these eigenvalues, which have a linear distribution, there
are the other sets given by $J_{l+1/2}(k_{ln}a) = 0$. These are irregularly
distributed and interspersed between the $k_{0n}$.
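How the roots for different l interleave can be seen numerically. The sketch below is an illustration under stated assumptions, not the method of the text: it uses the spherical Bessel function $j_l(x)$, which is proportional to $x^{-1/2}J_{l+1/2}(x)$ and therefore has the same positive roots, builds it by the standard upward recurrence, and locates roots by a sign scan followed by bisection. The helper names `spherical_j` and `zeros`, the scan step, and the range of l are all choices made here for illustration.

```python
import math

def spherical_j(l, x):
    """Spherical Bessel function j_l(x) by upward recurrence (adequate for x >= 1 and small l)."""
    j_prev = math.sin(x) / x
    if l == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for n in range(1, l):
        # j_{n+1}(x) = (2n+1)/x * j_n(x) - j_{n-1}(x)
        j_prev, j_curr = j_curr, (2 * n + 1) / x * j_curr - j_prev
    return j_curr

def zeros(l, count, start=1.0, step=0.01):
    """First `count` positive roots of j_l, found by a sign scan and bisection."""
    roots = []
    a, fa = start, spherical_j(l, start)
    while len(roots) < count:
        b = a + step
        fb = spherical_j(l, b)
        if fa * fb < 0.0:
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fm = spherical_j(l, mid)
                if flo * fm <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fm
            roots.append(0.5 * (lo + hi))
        a, fa = b, fb
    return roots

# Values of k*a for l = 0..7, merged and sorted; the twelve lowest are listed with their l.
table = sorted((x, l) for l in range(8) for x in zeros(l, 4))[:12]
for x, l in table:
    print(f"l = {l}   k a = {x:.6f}")
```

Each listed value of ka corresponds to the frequency $\nu = kv/2\pi$; the l = 0 entries fall at multiples of $\pi$, with the other sets scattered between them.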

The orthogonality of the $S_k$ (eq. 25) is at once evident. Orthogonality
with respect to the index l arises from a property of the spherical harmonics
proved in sec. 3.53. But even for the same l and different k the functions
retain their orthogonality. The weighting factor in this case is $r^2$ because
the volume element contains this factor. Thus

$$\int_0^a S_{k_1} S_{k_2}\,r^2\,dr \propto \int_0^a J_{l+1/2}(k_1 r)\,J_{l+1/2}(k_2 r)\,r\,dr$$

an expression which vanishes unless $k_1 = k_2$, as is seen from the last formula
of sec. 3.9.

By more special considerations it may also be shown that the set of
functions is complete in the sense that any $f(r,\theta,\varphi)$ which vanishes
at r = a and is piecewise continuous can be expanded in the form
$\sum_l \sum_n a_{ln}\,Y_l(\theta,\varphi)\,r^{-1/2}J_{l+1/2}(k_{ln}r)$. For the special case l = 0 this expansion
reduces to a Fourier series.
Problem. Compute the lowest 12 eigenfrequencies of the vibrating sphere. 


8.5. Laplace and Related Transformations.—A Laplace transformation
of the function F(t) is

$$\int_0^\infty F(t)\,e^{-st}\,dt = f(s) \tag{8-26}$$

The function f(s) is the Laplace transform of F(t). Symbolically, we write

$$f(s) = \mathcal{L}\{F\} \tag{8-27}$$


If, in eq. (26), we put $t = -\ln x$, we get

$$f(s) = \int_0^1 F(-\ln x)\,x^{s-1}\,dx$$

This is called a Mellin transformation.
If $s = -ix$, eq. (26) reads

$$f(-ix) = \int_0^\infty F(t)\,e^{ixt}\,dt$$

It represents $f(-ix)$ as a Fourier transform of F(t). In eq.
(8-13) we have encountered a formula very similar to this, except that
there the transformation was “two-sided,” or bilateral, i.e., the integral
was extended from $-\infty$ to $+\infty$, and the function was called f(x) instead
of f(−ix). In this section we limit our study to one-sided Laplace
transformations, which are the ones usually encountered in practice.

We shall now derive a formula expressing 
F(t) in terms of f(s), i.e., a formula which 
represents the inversion of eq. (26). 

By Cauchy’s integral theorem, eq. (3-1),

$$f(s) = \frac{1}{2\pi i}\oint_C \frac{f(z)\,dz}{z - s}$$

[Fig. 8-1]




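Suppose f(z) is analytic to the right of the line $x = \gamma$ (see Fig. 1). We
can then distort the contour C and integrate from $\gamma + i\infty$ to $\gamma - i\infty$,
thence to the right to $\infty - i\infty$, up to $\infty + i\infty$ and back to $\gamma + i\infty$.
Only the part from $\gamma + i\infty$ to $\gamma - i\infty$ contributes to the integral. Hence,

$$f(s) = \frac{1}{2\pi i}\int_{\gamma+i\infty}^{\gamma-i\infty} \frac{f(z)\,dz}{z-s} = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \frac{f(z)\,dz}{s-z}$$

Clearly, $\gamma$ must be smaller than the real part of s, in symbols: $\gamma < R(s)$.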
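To the last equation we apply the inverse operator $\mathcal{L}^{-1}$, understanding that
$\mathcal{L}^{-1}f(s) = F(t)$:

$$F(t) = \mathcal{L}^{-1}f(s) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} f(z)\,\mathcal{L}^{-1}\!\left(\frac{1}{s-z}\right)dz \tag{8-28}$$

But we now show that $\mathcal{L}^{-1}\!\left(\dfrac{1}{s-z}\right) = e^{zt}$. We have

$$\mathcal{L}(e^{zt}) = \int_0^\infty e^{-(s-z)t}\,dt = \frac{1}{s-z}\qquad\text{if}\quad R(s) > R(z)$$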


This condition is satisfied in eq. (28). Hence,

$$F(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} f(z)\,e^{zt}\,dz = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} f(s)\,e^{st}\,ds \tag{8-29}$$

Here $\gamma$ is any real number such that, to the right of $R(z) = \gamma$ (a vertical
line through $\gamma$), the function f(z) is analytic. Equations (26) and (29)
represent a Laplace transformation of F(t) and its inverse. F(t) and f(s)
are said to be a pair of Laplace transforms.
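The key step of the inversion argument, $\mathcal{L}(e^{zt}) = 1/(s-z)$ for $R(s) > R(z)$, is easy to check by direct quadrature. The sketch below is illustrative only: the helper `laplace`, the cutoff `upper`, and the sample values of `z` and `s` are assumptions made here, and the infinite upper limit is simply truncated where the integrand has become negligible.

```python
import cmath

def laplace(func, s, upper=40.0, n=40000):
    """Approximate integral_0^upper func(t) exp(-s t) dt by Simpson's rule (n must be even)."""
    h = upper / n
    total = func(0.0) + func(upper) * cmath.exp(-s * upper)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * func(t) * cmath.exp(-s * t)
    return total * h / 3.0

z = -0.5 + 2.0j   # exponent of the test function F(t) = exp(z t)
s = 1.0 + 0.5j    # chosen so that Re(s) > Re(z), as the derivation requires

numeric = laplace(lambda t: cmath.exp(z * t), s)
exact = 1.0 / (s - z)
print(numeric, exact)
```

If `s` is moved to the left of `z` the truncated integral grows with `upper` instead of settling down, which is the numerical face of the condition $R(s) > R(z)$.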

When the function F(t) is changed to some other function, f(s) will 
likewise undergo a change. It is useful to study such correlated changes. 


Suppose $f(s) = \mathcal{L}[F(t)]$.

Now “operate” on the function F(t) with some operator P, converting it
into PF(t). The transform of this function may be called pf(s), so that

$$pf(s) = \mathcal{L}[PF]$$


We want to know what operator p corresponds to P. 
(1) Let P be a linear substitution:

$$PF(t) = F(at - b),\qquad a > 0,\quad b > 0$$

$$\mathcal{L}[F(at-b)] = \int_0^\infty F(at-b)\,e^{-st}\,dt = \frac{1}{a}\,e^{-bs/a}\int_{-b}^{\infty} F(\tau)\,e^{-s\tau/a}\,d\tau$$

Insofar as the Laplace transformation is concerned, only the behavior of
F(t) for positive t and zero is important. We shall now define F to be zero
for t < 0. Thus, if F vanishes for negative arguments,

$$\mathcal{L}[F(at-b)] = \frac{e^{-bs/a}}{a}\int_0^\infty F(\tau)\,e^{-s\tau/a}\,d\tau,$$

or

$$\mathcal{L}[F(at-b)] = \frac{e^{-bs/a}}{a}\,f\!\left(\frac{s}{a}\right) \tag{8-30}$$

The operator p which corresponds to our linear substitution is: multiplication
by $\dfrac{e^{-bs/a}}{a}$ and substitution of $\dfrac{s}{a}$ for s.


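(2) Integration. Let $P = \int_0^t dt$.

We wish to find $\int_0^\infty dt\,e^{-st}\int_0^t F(\tau)\,d\tau = \int_0^\infty dt\,e^{-st}\,\Phi(t)$ where $\dfrac{d\Phi}{dt} = F(t)$.

Integrate by parts, obtaining

$$-\frac{1}{s}\,\Phi\,e^{-st}\,\Big|_0^\infty + \frac{1}{s}\int_0^\infty F(t)\,e^{-st}\,dt$$

The integrated part vanishes; hence

$$\mathcal{L}\!\left[\int_0^t F(\tau)\,d\tau\right] = \frac{1}{s}\,f(s)$$

To integration, there corresponds division by s. Also, for iterated
integration, one finds by repeated application of this formula

$$\mathcal{L}\!\left[\left(\int_0^t dt\right)^{\!n} F\right] = s^{-n}\,\mathcal{L}[F] \tag{8-31}$$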

(3) Differentiation.

$$\mathcal{L}[F'] = \int_0^\infty e^{-st}\,\frac{dF}{dt}\,dt = e^{-st}F\,\Big|_0^\infty + s\int_0^\infty e^{-st}F\,dt$$

$$= s\,\mathcal{L}[F] - F(0) = s\,f(s) - F(0) \tag{8-32}$$

provided F(0) is the value of F at t = 0.
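Rules (8-30), (8-31) with n = 1, and (8-32) can each be spot-checked by quadrature. The sketch below is a rough illustration with assumed names and sample values: it uses $F(t) = te^{-t}$ (taken to be zero for negative arguments) for the linear substitution, and the pair $\cos t$, $\sin t$ for integration and differentiation, relying on the standard transforms $1/(s+1)^2$ and $s/(s^2+1)$.

```python
import math

def laplace(func, s, upper=40.0, n=40000):
    """Approximate integral_0^upper func(t) exp(-s t) dt by Simpson's rule (n must be even)."""
    h = upper / n
    total = func(0.0) + func(upper) * math.exp(-s * upper)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * func(t) * math.exp(-s * t)
    return total * h / 3.0

def F(t):
    """Test function t*exp(-t), defined to vanish for negative arguments."""
    return t * math.exp(-t) if t >= 0.0 else 0.0

def f(s):
    """Its Laplace transform, 1/(s+1)^2."""
    return 1.0 / (s + 1.0) ** 2

s, a, b = 1.5, 2.0, 1.0

# (8-30): linear substitution t -> a t - b
shift_lhs = laplace(lambda t: F(a * t - b), s)
shift_rhs = math.exp(-b * s / a) / a * f(s / a)

# (8-31) with n = 1: sin t is the integral of cos t from 0 to t
cos_transform = s / (s * s + 1.0)
integ_lhs = laplace(math.sin, s)
integ_rhs = cos_transform / s

# (8-32): the derivative of cos t is -sin t, and cos 0 = 1
deriv_lhs = laplace(lambda t: -math.sin(t), s)
deriv_rhs = s * cos_transform - 1.0

print(shift_lhs, shift_rhs)
print(integ_lhs, integ_rhs)
print(deriv_lhs, deriv_rhs)
```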
(4) Convolution. If $F_1$ and $F_2$ are both functions of t, the integral

$$\int_0^t F_1(\tau)\,F_2(t - \tau)\,d\tau \tag{8-33}$$

is often denoted by $F_1 * F_2$ and called the convolution or Faltung of $F_1$ and
$F_2$. It is of frequent occurrence in physical problems. Suppose, for
instance, that an error, $\epsilon$, is the linear result of two individual errors,




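$\epsilon = \epsilon_1 + \epsilon_2$, and that we know the distributions, or probabilities, of $\epsilon_1$ and
$\epsilon_2$. These are $w_1(\epsilon_1)$ and $w_2(\epsilon_2)$. The distribution of $\epsilon$ is then clearly

$$w(\epsilon) = \iint_{\epsilon_1+\epsilon_2=\epsilon} w_1(\epsilon_1)\,w_2(\epsilon_2)\,d\epsilon_1\,d\epsilon_2 = \int w_1(\epsilon_1)\,w_2(\epsilon - \epsilon_1)\,d\epsilon_1 = w_1 * w_2$$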


The German word “Faltung” means folding; it arises from the following
simple fact. If a line of length t be folded back in the middle, as in
Fig. 8-2, the points adjacent to each other on the two segments are those
which lie, respectively, at distances $\tau$ and $t - \tau$ from the origin 0. These,
however, are the arguments of the functions $F_1$ and $F_2$ that occur in the
convolution integral.

[Fig. 8-2]

One final comment regarding this integral: If $F_2$ is
defined to be zero for negative arguments, the upper limit, instead of being
t, can be taken to be $\infty$.

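The Laplace transform of a convolution is very easy to compute.

$$\mathcal{L}[F_1 * F_2] = \int_0^\infty dt\,e^{-st}\int_0^t F_1(\tau)\,F_2(t-\tau)\,d\tau$$

$$= \int_0^\infty d\tau' \int_0^\infty d\tau\,e^{-s(\tau+\tau')}\,F_1(\tau)\,F_2(\tau')$$

$$= \int_0^\infty F_2(\tau')\,e^{-s\tau'}\,d\tau' \int_0^\infty F_1(\tau)\,e^{-s\tau}\,d\tau$$

if $F_2(t) = 0$ for t < 0.
Hence,

$$\mathcal{L}[F_1 * F_2] = f_1 f_2 \tag{8-34}$$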


The transform of the convolution is the product of the transforms of $F_1$
and $F_2$. Note also that convolution is commutative,

$$F_1 * F_2 = F_2 * F_1$$

and associative,

$$(F_1 * F_2) * F_3 = F_1 * (F_2 * F_3)$$
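Theorem (8-34) and the commutative property can be illustrated with two decaying exponentials. In the sketch below, which is an illustration with assumed helper names and step counts, $F_1 = e^{-t}$ and $F_2 = e^{-2t}$, so that $f_1 f_2 = 1/[(s+1)(s+2)]$ and the convolution has the closed form $e^{-t} - e^{-2t}$; both the inner and the outer integral are done by Simpson's rule, with the outer one truncated at t = 20.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if b == a:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def convolve(F1, F2, t, n=100):
    """(F1 * F2)(t) = integral_0^t F1(tau) F2(t - tau) d(tau), eq. (8-33)."""
    return simpson(lambda tau: F1(tau) * F2(t - tau), 0.0, t, n)

def F1(t):
    return math.exp(-t)

def F2(t):
    return math.exp(-2.0 * t)

s = 1.0
# Left side of (8-34): transform of the numerically computed convolution.
lhs = simpson(lambda t: math.exp(-s * t) * convolve(F1, F2, t), 0.0, 20.0, 2000)
# Right side: product of the two individual transforms.
rhs = (1.0 / (s + 1.0)) * (1.0 / (s + 2.0))
print(lhs, rhs)
print(convolve(F1, F2, 1.3), convolve(F2, F1, 1.3))  # commutativity
```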
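(5) Multiplication. To find

$$\mathcal{L}[F_1(t)\cdot F_2(t)]$$

we consider the triple integral

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dz \int_0^\infty e^{-(s-iz)t_1}\,F_1(t_1)\,dt_1 \int_0^\infty e^{-izt_2}\,F_2(t_2)\,dt_2$$

which is, by definition, $\dfrac{1}{2\pi}\displaystyle\int_{-\infty}^{\infty} dz\,f_1(s - iz)\,f_2(iz)$. On integrating over z,
there results $\dfrac{1}{2\pi}\displaystyle\int_{-\infty}^{\infty} e^{iz(t_1-t_2)}\,dz = \delta(t_1,t_2)$, the $\delta$-function defined in (8-15).

Hence the integral becomes

$$\int_0^\infty e^{-st_1}F_1(t_1)\,dt_1 \int_0^\infty F_2(t_2)\,\delta(t_1,t_2)\,dt_2 = \int_0^\infty e^{-st}F_1(t)\,F_2(t)\,dt$$

Therefore

$$\mathcal{L}(F_1F_2) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - iz)\,f_2(iz)$$

If the variable is changed from iz to z,

$$\mathcal{L}(F_1F_2) = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty} dz\,f_1(s - z)\,f_2(z) \tag{8-35}$$

The transform of a product is a convolution along the imaginary axis.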
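For Fourier transformations, we have

$$\mathfrak{F}[F] = f(s) = \int_{-\infty}^{\infty} e^{ist}\,F(t)\,dt$$

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - z)\,f_2(z) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dz \int_{-\infty}^{\infty} e^{i(s-z)t_1}F_1(t_1)\,dt_1 \int_{-\infty}^{\infty} e^{izt_2}F_2(t_2)\,dt_2$$

The integration over z yields $\delta(t_1,t_2)$; hence

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - z)\,f_2(z) = \int_{-\infty}^{\infty} e^{ist}\,F_1(t)\,F_2(t)\,dt = \mathfrak{F}[F_1F_2]$$

The Fourier transform of a product is $\dfrac{1}{2\pi}$ times the ordinary convolution:

If $\mathfrak{F}(F_1) = f_1$, $\mathfrak{F}(F_2) = f_2$, then

$$\mathfrak{F}(F_1F_2) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dz\,f_1(s - z)\,f_2(z) = \frac{1}{2\pi}\,f_1 * f_2 \tag{8-36}$$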


8.6. Use of Transforms in Solving Differential Equations.—A. Consider
the differential equation

$$Y'' + k^2 Y = 0 \tag{8-37}$$

where the primes denote differentiation with respect to t.
Multiply by $\int_0^\infty e^{-st}\,dt$ (that is, take the Laplace transform) to obtain

$$\mathcal{L}(Y'') + k^2\,\mathcal{L}(Y) = 0$$

Now by eq. (32),

$$\mathcal{L}(Y'') = s\,\mathcal{L}(Y') - Y'(0)$$

$$\mathcal{L}(Y') = s\,\mathcal{L}(Y) - Y(0)$$

hence,

$$\mathcal{L}(Y'') = s^2\,\mathcal{L}(Y) - Y'(0) - sY(0)$$

If we write y(s) for $\mathcal{L}(Y)$, eq. (37) becomes

$$s^2 y - Y'(0) - sY(0) + k^2 y = 0$$

Note how nicely the initial conditions on Y and on Y’ introduce themselves 


into the calculation! 
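On solving, we obtain

$$y = \frac{Y'(0) + sY(0)}{s^2 + k^2} = \frac{a_1}{s + ik} + \frac{a_2}{s - ik}$$

where $a_1 = \tfrac{1}{2}\left[Y(0) + \dfrac{i}{k}\,Y'(0)\right]$, $a_2 = \tfrac{1}{2}\left[Y(0) - \dfrac{i}{k}\,Y'(0)\right]$.
By eq. (29)

$$Y(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} y(s)\,e^{st}\,ds$$

Now

$$\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \frac{e^{st}\,ds}{s + ik} = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \frac{e^{(s'-ik)t}}{s'}\,ds' \qquad (\text{on putting } s' = s + ik)$$

$$= e^{-ikt}\,\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \frac{e^{s't}}{s'}\,ds'$$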

This integral can be evaluated by the method of residues (see sec. 3.2).
Since $\gamma$ must be positive in order that the singularity at $s' = 0$ be avoided,
we integrate along the square drawn in Fig. 3. The extension of the path
to close the contour is harmless, for the added parts contribute nothing to
the integral.

[Fig. 8-3]

The residue of $\dfrac{e^{s't}}{s'}$ within the contour lies at the origin and



equals 1. Hence,

$$\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \frac{e^{st}\,ds}{s + ik} = e^{-ikt}$$

The other part of Y, coming from $\dfrac{1}{s - ik}$, yields $e^{ikt}$. Thus

$$Y = a_1 e^{-ikt} + a_2 e^{ikt} \tag{8-38}$$


The reader can easily verify that this is a solution, indeed the solution which 
satisfies the initial conditions. 
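For the reader who wishes to carry out that verification, the following short reduction may help. It assumes only the partial-fraction coefficients of the solution, written here as $a_{1,2} = \tfrac12[Y(0) \pm (i/k)Y'(0)]$, and recombines the exponentials into real form:

```latex
\begin{aligned}
Y(t) &= a_1 e^{-ikt} + a_2 e^{ikt},
  \qquad a_{1,2} = \tfrac12\Big[\,Y(0) \pm \tfrac{i}{k}\,Y'(0)\Big] \\
     &= Y(0)\,\frac{e^{ikt} + e^{-ikt}}{2}
        + \frac{i\,Y'(0)}{2k}\,\big(e^{-ikt} - e^{ikt}\big) \\
     &= Y(0)\cos kt + \frac{Y'(0)}{k}\,\sin kt .
\end{aligned}
```

Setting t = 0 in the last line and in its derivative returns Y(0) and Y'(0) respectively, and two differentiations reproduce $Y'' = -k^2 Y$.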
B. Consider the inhomogeneous equation

$$Y'' + k^2 Y = F(t) \tag{8-39}$$

This leads to

$$s^2 y - Y'(0) - sY(0) + k^2 y = f(s)$$

if f is the transform of F.
Hence,

$$y = \frac{f(s) + Y'(0) + sY(0)}{k^2 + s^2}$$

We thus obtain, in addition to the solution of the homogeneous equation
(38), a solution $Y_1$ whose transform is $\dfrac{f(s)}{k^2 + s^2}$. This can often be found
in tables. Otherwise we proceed as follows: Let

$$f(s) = f_1,\qquad \frac{1}{k^2 + s^2} = f_2$$

Then, by theorem (34), $Y_1$ is the convolution of $F_1(t)$ and $F_2(t)$. $F_1$ is the
inhomogeneity in the differential equation (39).


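We have

$$f_2 = \frac{1}{k^2 + s^2} = \frac{i}{2k}\left[\frac{1}{s + ik} - \frac{1}{s - ik}\right]$$

Hence

$$F_2(t) = \frac{i}{2k}\cdot\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\left[\frac{1}{s + ik} - \frac{1}{s - ik}\right]e^{st}\,ds = \frac{i}{2k}\left[e^{-ikt} - e^{ikt}\right] = \frac{\sin kt}{k}$$

$$Y_1 = \frac{1}{k}\int_0^t F(\tau)\,\sin k(t - \tau)\,d\tau$$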
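The convolution form of the particular solution lends itself to a direct numerical check. The sketch below is a rough illustration with assumed names and sample values: it evaluates $\frac{1}{k}\int_0^t F(\tau)\sin k(t-\tau)\,d\tau$ by Simpson's rule, compares it with the closed form $t/k^2 - \sin(kt)/k^3$ that results for $F(t) = t$, and then tests the differential equation itself for $F(t) = e^{-t}$ by a central-difference estimate of the second derivative.

```python
import math

def particular(F, k, t, n=200):
    """(1/k) * integral_0^t F(tau) sin(k (t - tau)) d(tau), by Simpson's rule (n even)."""
    if t == 0.0:
        return 0.0
    h = t / n
    total = F(0.0) * math.sin(k * t)  # tau = 0 endpoint; the tau = t endpoint vanishes
    for i in range(1, n):
        tau = i * h
        total += (4 if i % 2 else 2) * F(tau) * math.sin(k * (t - tau))
    return total * h / (3.0 * k)

k, t = 2.0, 1.7

# Case F(t) = t: compare with the closed form t/k^2 - sin(kt)/k^3.
numeric = particular(lambda tau: tau, k, t)
closed = t / k**2 - math.sin(k * t) / k**3
print(numeric, closed)

# Case F(t) = exp(-t): residual of Y'' + k^2 Y - F, with Y'' from central differences.
def F_exp(tau):
    return math.exp(-tau)

d = 1e-2
y_minus = particular(F_exp, k, t - d)
y_mid = particular(F_exp, k, t)
y_plus = particular(F_exp, k, t + d)
residual = (y_plus - 2.0 * y_mid + y_minus) / d**2 + k**2 * y_mid - F_exp(t)
print(residual)  # small, limited by the finite-difference step
```

Because the integral vanishes together with its first derivative at t = 0, this part of the solution carries none of the initial data; those enter only through the homogeneous part.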


Problem. Solve the equation, $Y'' + 2bY' + \omega^2 Y = f_0 \sin at$, by this method and
compare the result with example a of sec. 2.8.


Table 1 presents a list of Laplace pairs. Such tables have to be used 
with care, for it is not always easy to state explicitly the conditions under 
which the integrals converge. 
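TABLE 1

F(t)                                           f(s)

$1$                                            $1/s$
$t^z$, ($R z > -1$)                            $\Gamma(z+1)/s^{z+1}$
$e^{at}$                                       $1/(s-a)$
$t^b e^{at}$                                   $\Gamma(b+1)/(s-a)^{b+1}$
$\cos \omega t$                                $s/(s^2+\omega^2)$
$\sin \omega t$                                $\omega/(s^2+\omega^2)$
$\cosh \omega t$                               $s/(s^2-\omega^2)$
$\sinh \omega t$                               $\omega/(s^2-\omega^2)$
$\delta(t,\tau)$                               $e^{-s\tau}$
$0$ if $0 \le t < \tau$; $1$ if $t \ge \tau$   $e^{-s\tau}/s$
$\cos(z\sqrt{t})/\pi\sqrt{t}$ (z real)         $(\pi s)^{-1/2}\,e^{-z^2/4s}$
$\sin(z\sqrt{t})/\pi$ (z real)                 $\dfrac{z}{2\pi^{1/2}s^{3/2}}\,e^{-z^2/4s}$
$J_0(t)$                                       $(1+s^2)^{-1/2}$
$J_n(t)$; $R(n) > -1$                          $(1+s^2)^{-1/2}\,[(1+s^2)^{1/2} - s]^n$
$L_n(t)$ (See sec. 3.11)                       $s^{-n-1}(s-1)^n$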

More extensive tables of Laplace transforms may be found in G. Doetsch,
Laplace-Transformation, Dover Publishing Co., 1948.

See also:

Carslaw, H. S., and Jaeger, J. C., “Conduction of Heat in Solids,” Clarendon Press,
Oxford, 1948.

Murnaghan, F. D., “Introduction to Applied Mathematics,” John Wiley and Sons, New
York, 1948.

Jeffreys, H., “Operational Methods in Mathematical Physics,” Cambridge University
Press, 1927.

Widder, D., “The Laplace Transform,” Princeton University Press, 1941.

Magnus, W. and Oberhettinger, F., “Formulas and Theorems for the Special
Functions of Mathematical Physics,” Chelsea Publishing Co., New York, 1949.
This contains tables of Fourier, Laplace, Hankel, Mellin, and Gauss transforms.

For Fourier transforms see:

Carslaw, H. S., and Jaeger, J. C., “Operational Methods in Applied Mathematics,”
Oxford Press, 1941.

Churchill, R. V., “Fourier Series and Boundary Value Problems,” McGraw-Hill
Book Co., New York, 1941.

Sneddon, I. N., “Fourier Transforms,” McGraw-Hill Book Co., New York, 1951.

Titchmarsh, E. C., “Introduction to the Theory of Fourier Integrals,” Oxford
Press, 1937.

Wiener, N., “The Fourier Integral and Certain of Its Applications,” Cambridge
University Press, 1933.


8.7. Sturm-Liouville Theory.—Deeper insight into the nature of
eigenvalue problems which arise in connection with second order differential
equations is obtained from a study of a theory at once simple and
beautiful, the theory of the Sturm-Liouville equation. Nearly every
eigenvalue problem encountered in physics and chemistry leads to an
equation of the general form

$$L(u) + \lambda w u = 0 \tag{8-40}$$

where the differential operator L is defined by

$$L(u) = (pu')' - qu \tag{8-41}$$

The quantities p, q, and w are understood to be functions of the independent
variable x, and we shall suppose that w, which will soon be recognized as
the former weighting function, satisfies

$$w(x) > 0$$

in the entire range of the variable x. This range is different in different
problems, but it will be assumed to be finite and to extend from a to b.
Finally, $\lambda$ is a constant; it will turn out to be the eigenvalue parameter.
An operator¹¹ of the form (41) is said to be self-adjoint. The necessary
and sufficient condition for the general second order differential operator

$$D(u) = fu'' + gu' + hu$$

(in which f, g, and h are functions of x) to be self-adjoint is simply that
$g = f'$. Eq. (40), however, is not a very special one. Every second order


11 For a general definition of an operator and its adjoint the reader is referred to
Courant-Hilbert, “Methoden der Mathematischen Physik,” Vol. II, Second Edition,
p. 434, or Frank, P. and v. Mises, R., “Differentialgleichungen der Physik,” Vol. 1,
p. 720.
differential operator D(u) can be made self-adjoint; it need only be multiplied
from the left by $\exp \displaystyle\int \frac{g - f'}{f}\,dx$. Thus all differential equations
encountered in Chapter 2 may be written in self-adjoint form, and the
theory we are presenting applies to them all. In Table 2 we list the factor
F by which the equation named on the left, written in the customary form
in which it appears in Chapter 2, must be multiplied in order to be
self-adjoint, and also the quantities p, q, and w in (40).

The function u is subject to boundary conditions. In the examples of
the preceding sections these were of different types: in the problem of the
string every u had to vanish at both end points, in the other problem it
was to be finite at r = 0 but zero at r = a. Examination of these and
many other examples (see Chapter 11) will show that the boundary condition
in most problems of interest may be expressed in the uniform way

$$p u u'\,\Big|_{a} = p u u'\,\Big|_{b} = 0$$

for usually either p or u or u' vanishes at the end points of the range. But
it is equally satisfactory to state these conditions in a somewhat milder
form: Let u and v be any permissible solutions of eq. (40); we then require

$$v p u'\,\Big|_a^b = 0 \tag{8-42}$$

On the basis of this condition it is possible to establish the important
theorem:

$$\int_a^b v\,L(u)\,dx = \int_a^b u\,L(v)\,dx \tag{8-43}$$

The proof is straightforward:

$$\int_a^b v\,L(u)\,dx = \int v\,(pu')'\,dx - \int vqu\,dx = vpu'\,\Big|_a^b - \int v'pu'\,dx - \int vqu\,dx$$

The first term on the right vanishes because of (42); the second may be
transformed by another partial integration into $-v'pu\,\Big|_a^b + \displaystyle\int u\,(pv')'\,dx$,
of which the first vanishes also. But the remaining integral,
$\displaystyle\int [u\,(pv')' - uqv]\,dx$, is nothing other than $\displaystyle\int u\,L(v)\,dx$. The result (43)


is often expressed by saying that the operator L is Hermitian with respect 
to functions satisfying condition (42). The importance of Hermitian 
operators will be more evident in the next chapters. 
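Theorem (43) can be confirmed exactly in a simple case. The sketch below is an illustration under stated assumptions: it takes the Legendre operator, $p = 1 - x^2$ and $q = 0$ on the range from −1 to 1, where p vanishes at both end points so that condition (42) holds for any polynomials u and v. Polynomials are stored as coefficient lists (lowest power first) and handled with exact rational arithmetic; the helper names and the particular u and v are arbitrary choices made here.

```python
from fractions import Fraction

def mul(a, b):
    """Product of two polynomials given as coefficient lists, lowest power first."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def sub(a, b):
    """Difference a - b of two polynomials (shorter list padded with zeros)."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def der(a):
    """Derivative of a polynomial."""
    return [i * c for i, c in enumerate(a)][1:] or [Fraction(0)]

def integral(a, lo, hi):
    """Exact definite integral of a polynomial from lo to hi."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(c * (hi ** (i + 1) - lo ** (i + 1)) / (i + 1) for i, c in enumerate(a))

def L(u, p, q):
    """L(u) = (p u')' - q u, as in eq. (8-41)."""
    return sub(der(mul(p, der(u))), mul(q, u))

p = [Fraction(1), Fraction(0), Fraction(-1)]                 # p = 1 - x^2
q = [Fraction(0)]                                            # q = 0
u = [Fraction(1), Fraction(2), Fraction(0), Fraction(1)]     # u = 1 + 2x + x^3
v = [Fraction(0), Fraction(1), Fraction(3)]                  # v = x + 3x^2

lhs = integral(mul(v, L(u, p, q)), -1, 1)
rhs = integral(mul(u, L(v, p, q)), -1, 1)
print(lhs, rhs)
```

Replacing p by a function that does not vanish at the end points, while keeping the same u and v, generally destroys the equality, which shows that condition (42) is doing real work.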
8.8. Variational Aspects of the Eigenvalue Problem.¹²—Before proceeding
further, the reader is advised to review the main points of Chapter 6.
It will be shown that the Sturm-Liouville equation (40) is the Euler condition
which the function u must satisfy in order that (1) the integral¹³

$$\int (pu'^2 + qu^2)\,dx = A(u) \tag{8-44}$$

take on a stationary value, (2) the function u be normalized:

$$\int wu^2\,dx = 1 \tag{8-45}$$

The proof is simple. In the notation of sec. 6.5 we have, on writing
$\lambda_1 = -\lambda$ for convenience,

$$K = I - \lambda I_1 = pu'^2 + qu^2 - \lambda wu^2$$

and the Euler equation (6-15) is

$$\frac{\partial K}{\partial u} - \frac{d}{dx}\,\frac{\partial K}{\partial u'} = 0 \tag{8-46}$$
This is clearly identical with (40). The eigenvalue $\lambda$ here plays the role of
a Lagrangian multiplier. We have thus seen that the process of solving
the Sturm-Liouville equation is tantamount to a search for those functions
u(x) which maximize or minimize A, subject to condition (45). This condition
is important, for the integral A has usually only a single stationary
value; but when eq. (45) is imposed A has numerous values each of which
is stationary for a given neighborhood of functions u(x), although of course
only one of them is an absolute minimum or maximum.

Example. Let us see whether the procedure here outlined will actually
lead to a simple type of function defined by a Sturm-Liouville equation, say
the Legendre polynomial. We start by assuming

$$u = a + bx + cx^2$$

with a, b, and c unknown. From Table 2 we see that $p = 1 - x^2$; hence,

$$A = \int_{-1}^{1} (1 - x^2)(b^2 + 4bcx + 4c^2x^2)\,dx = \tfrac{4}{3}b^2 + \tfrac{16}{15}c^2$$

We require that

$$\int_{-1}^{1} u^2\,dx = 2a^2 + \tfrac{2}{3}(b^2 + 2ac) + \tfrac{2}{5}c^2 = 1$$
12 The development in this and the following sections leans heavily on Courant-
Hilbert, “Methoden der Mathematischen Physik,” Vol. I, Second Edition.

13 Henceforth in this chapter limits of integration will not be indicated when the
range is from a to b.
Thus it is necessary to minimize

$$\tfrac{4}{3}b^2 + \tfrac{16}{15}c^2 - \lambda\left[2a^2 + \tfrac{2}{3}(b^2 + 2ac) + \tfrac{2}{5}c^2\right]$$

by choice of a, b, and c. On putting the partial derivatives with respect to
a, b, and c equal to zero and finally rewriting the normalization condition,
four equations are obtained for the determination of the quantities a, b, c,
and $\lambda$:

$$\lambda\left(a + \frac{c}{3}\right) = 0 \tag{1}$$

$$b(2 - \lambda) = 0 \tag{2}$$

$$c(8 - 3\lambda) - 5a\lambda = 0 \tag{3}$$

$$a^2 + \tfrac{1}{3}(b^2 + 2ac) + \frac{c^2}{5} = \tfrac{1}{2} \tag{4}$$

Suppose we put c = 0. Then, according to (1) and (3), $a\lambda = 0$, while (4)
yields a relation between a and b. Hence we can put either a = 0 or
$\lambda = 0$. In the latter instance, i.e., if

$$\lambda = 0$$

we get from (2), b = 0, and from (4), $a = \sqrt{1/2}$. In the former instance,
namely a = 0, (2) yields

$$\lambda = 2$$

and (4) gives $b = \sqrt{3/2}$.

Now instead of assuming c = 0, let us take b = 0. Consistency then
requires that neither a nor c nor $\lambda$ can be zero. Hence we find from (1),
$c = -3a$, and from (3) and (4),

$$\lambda = 6$$

and $a = \sqrt{5/8}$. We have thus determined altogether three solutions,
corresponding to three possible values of $\lambda$:

$\lambda$        u

0        $\sqrt{1/2}$

2        $\sqrt{3/2}\,x$

6        $\sqrt{5/8}\,(1 - 3x^2)$

The reader will notice that the $\lambda$’s are the first three values of $l(l + 1)$, and
the u’s the first three normalized Legendre polynomials. We now return
to eq. (40) in its general form.
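The three results of the example can be confirmed with exact arithmetic. The sketch below is illustrative, with assumed helper names: for each of the trial functions 1, x, and $1 - 3x^2$ it forms the quotient of $\int_{-1}^{1}(1-x^2)u'^2\,dx$ by $\int_{-1}^{1}u^2\,dx$, which equals A for the normalized function, and it recovers the squared normalization constant as the reciprocal of the second integral.

```python
from fractions import Fraction

def mul(a, b):
    """Product of two polynomials given as coefficient lists, lowest power first."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def der(a):
    """Derivative of a polynomial."""
    return [Fraction(i) * c for i, c in enumerate(a)][1:] or [Fraction(0)]

def integrate(a):
    """Exact integral over [-1, 1]; odd powers drop out."""
    return sum(2 * c / Fraction(i + 1) for i, c in enumerate(a) if i % 2 == 0)

P = [Fraction(1), Fraction(0), Fraction(-1)]   # p = 1 - x^2, with q = 0 and w = 1

def quotient(u):
    """A(u) divided by the normalization integral, i.e. A for the normalized u."""
    du = der(u)
    return integrate(mul(P, mul(du, du))) / integrate(mul(u, u))

trials = {
    "1": [Fraction(1)],
    "x": [Fraction(0), Fraction(1)],
    "1 - 3x^2": [Fraction(1), Fraction(0), Fraction(-3)],
}
for name, u in trials.items():
    print(name, quotient(u), 1 / integrate(mul(u, u)))
```

The second number printed in each row is the square of the constant multiplying the polynomial in the table of solutions.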

Let us assume for definiteness that the extremals of A are minima; the
argument to be presented is equally valid when they are maxima. Also,
let $u_1(x)$ be that function which produces the lowest minimum of A while
satisfying (45), and let $\lambda_1$ be the eigenvalue corresponding to it. We now
seek a function $u_2(x)$ which will also produce a minimum of A and satisfy
(45), but which, in addition, shall be orthogonal to $u_1$:

$$\int w u_1 u_2\,dx = 0 \tag{8-47}$$

The Euler equation for $u_2$ is more complicated than that for $u_1$, since $u_2$
must satisfy two accessory conditions and $u_1$ only one. In fact,

$$K = pu_2'^2 + qu_2^2 - \lambda_2 w u_2^2 - \mu w u_1 u_2$$

$\mu$ being a new Lagrangian multiplier. Hence eq. (46) becomes

$$2qu_2 - 2\lambda_2 w u_2 - \mu w u_1 - 2(pu_2')' = 0$$

and this is identical with

$$L(u_2) + \lambda_2 w u_2 + \tfrac{1}{2}\mu w u_1 = 0$$


To determine the value of $\mu$ we multiply this equation by $u_1$ and integrate,
making use of relation (43) which, of course, we require $u_1$ and $u_2$
to obey. The result is

$$\int u_2\,L(u_1)\,dx + \lambda_2\int w u_1 u_2\,dx + \tfrac{1}{2}\mu\int w u_1^2\,dx = 0 \tag{8-48}$$

Here the first term is $-\lambda_1 \int w u_1 u_2\,dx$ because $u_1$ satisfies (40), and this
equals zero because of (47). For the latter reason, the second term of
(48) also vanishes. But the integral appearing in the third term is certainly
finite and different from zero (it equals 1 by (45)). Hence, we conclude that
the multiplier $\mu = 0$; we might as well not have introduced it as a consequence
of requiring relation (47): $u_2$ satisfies the same equation as $u_1$
but for a different eigenvalue $\lambda_2$. Moreover, it is automatically orthogonal
to $u_1$.

This process may be continued. Suppose we seek a function $u_3$ which
will minimize A, subject to the three conditions

$$\int w u_3^2\,dx = 1,\qquad \int w u_1 u_3\,dx = \int w u_2 u_3\,dx = 0$$

The minimum thus obtained will lie at least as high as that produced by $u_2$,
for the choice of functions has been further restricted. The quantity K
appearing in Euler's equation now contains three undetermined parameters,
$\lambda_3$, $\mu$, and $\nu$. The last two of these may be shown to vanish by a method
similar to that above. By further extension of this process we are led to
this result: If we desire a set of functions which (1) minimize A, (2) are
normalized, (3) are mutually orthogonal, they are found as solutions of
eq. (40).

Conversely, it is easy to show that all solutions of (40) belonging to
different eigenvalues are orthogonal. To do this, one need only multiply
two specific forms of (40):

$$L(u_i) + \lambda_i w u_i = 0,\qquad L(u_j) + \lambda_j w u_j = 0$$

by $u_j$ and $u_i$ respectively, integrate each equation and subtract. When
(43) is used, the result is simply

$$(\lambda_i - \lambda_j)\int w u_i u_j\,dx = 0 \tag{8-49}$$

Hence, either $\lambda_i = \lambda_j$, or $u_i$ and $u_j$ are orthogonal.

The case in which $\lambda_i = \lambda_j$, where two (or more) eigenfunctions belong
to the same eigenvalue, is not of very great interest under the simple conditions
we are here considering (real eigenfunctions, one independent variable).
It is very much more important in the more general eigenvalue
problems of Chapter 11. As to terminology, whenever several eigenfunctions,
i.e., linearly independent eigenfunctions, are associated with one
eigenvalue, that eigenvalue is said to be degenerate.

It may seem strange to find eq. (49) predicting orthogonality only for
non-degenerate cases, while the variational argument of the preceding paragraphs
implies no restriction of this sort. Harmony is restored when we
realize that a set of linearly independent solutions of eq. (40) can always be
combined in such a way as to form an equally numerous, equivalent set of
orthogonal solutions. (Cf., for instance, the method of Schmidt,¹⁴ sec.
10.8). Hence we may, if we like, speak of the orthogonality of all solutions
of eq. (40), assuming tacitly that the process of orthogonalization
has been carried out on all sets of functions belonging to a degenerate
eigenvalue.

One further point is to be made in connection with the variational
property of the solutions of the Sturm-Liouville equation. We have seen
that the $u_i$ minimize the integral A. What are the stationary values of A
thus produced? Let us compute them.

$$A(u_i) = \int (pu_i'^2 + qu_i^2)\,dx = u_i p u_i'\,\Big|_a^b - \int [u_i(pu_i')' - u_i q u_i]\,dx$$

$$= -\int u_i\,L(u_i)\,dx = \lambda_i \int w u_i^2\,dx = \lambda_i \tag{8-51}$$

The simple and interesting answer is, then, that the stationary values of A
are the eigenvalues $\lambda_i$.


14 The use of this method for functions instead of vectors is illustrated in Lindsay
and Margenau, “Foundations of Physics,” John Wiley and Sons, p. 425.
Example. Degeneracy arises when, in the vibrating string problem
expressed by eq. (2), one replaces the ordinary boundary conditions (a)
and (b), sec. 2, by one requiring only periodicity:

$$S(a) = S(b) \tag{8-50}$$

The eigenvalue parameter, $\lambda$, in this equation is $k^2$. Moreover, it is to be
noted that the periodicity condition (50) conforms to our general requirement
(42). The solution satisfying (50) is easily seen to be

$$S = A \sin\left(\delta + \frac{2\pi n}{l}\,x\right)$$

where $l = b - a$, and $\delta$ is arbitrary, the quantity k taking on the values
$2\pi n/l$. But n may be a positive or a negative integer. Hence to the
same value of $k^2$, namely $4\pi^2 n^2/l^2$, there correspond the two functions

$$S_1 = A_1 \sin\left(\delta + \frac{2\pi n}{l}\,x\right),\qquad S_2 = A_2 \sin\left(\delta - \frac{2\pi n}{l}\,x\right)$$

Except when $\delta$ is an integral multiple of $\pi$, as it must be when the ordinary
boundary condition is imposed, $S_1$ and $S_2$ are linearly independent. Yet
they are not orthogonal (except in the special case when $\delta = \pi/4$). It is
easily seen, however, that if we put

$$\Sigma_1 = \sqrt{\frac{2}{l}}\,\sin\left(\delta + \frac{2\pi n}{l}\,x\right)$$

$$\Sigma_2 = \sqrt{\frac{2}{l(1 - s^2)}}\left[\,s \sin\left(\delta + \frac{2\pi n}{l}\,x\right) - \sin\left(\delta - \frac{2\pi n}{l}\,x\right)\right]$$

wherein $s = \sin^2\delta - \cos^2\delta$, we have a pair of functions, satisfying the
differential equation for the same $k^2$, which are both orthogonal and normal.


Problem. The integral A(u) for the differential equation $u'' + k^2 u = 0$ is $\displaystyle\int_0^l (u')^2\,dx$.
Assume for u any normalized polynomial containing the factors x and $x - l$, and show
that A computed for this u is greater than the lowest eigenvalue $\pi^2/l^2$.
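As one concrete instance of what the problem asks, and not a general proof, take the simplest admissible polynomial, $u = c\,x(l - x)$, with c fixed by normalization. The computation then runs as follows:

```latex
\begin{aligned}
\int_0^l u^2\,dx &= c^2 \int_0^l x^2 (l - x)^2\,dx = \frac{c^2 l^5}{30} = 1
   \quad\Longrightarrow\quad c^2 = \frac{30}{l^5}, \\[4pt]
A(u) &= \int_0^l (u')^2\,dx = c^2 \int_0^l (l - 2x)^2\,dx = \frac{c^2 l^3}{3}
      = \frac{10}{l^2}, \\[4pt]
\frac{10}{l^2} &> \frac{\pi^2}{l^2} \approx \frac{9.8696}{l^2}.
\end{aligned}
```

The trial value exceeds the true lowest eigenvalue by little more than one per cent, which indicates how closely even a crude trial function can approach the minimum.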


8.9. Distribution of High Eigenvalues.—Preceding considerations indicate
no uniform law according to which the eigenvalues of any differential
equation are arranged; regularity does, however, prevail for the “high”
eigenvalues, as will now be shown. Let all $\lambda$’s be arranged in numerical
order, so that $\lambda_1$ is the lowest. Although no proof of the existence of an
infinite number of eigenvalues has here been given, their variational meaning
strongly suggests¹⁵ and the examples confirm this expectation. We
shall now prove the theorem:

$$\lim_{n\to\infty} \lambda_n = \text{const.}\; n^2 \tag{8-52}$$

Under the substitutions

$$z = (pw)^{1/4}\,u,\qquad t = \int_a^x \left(\frac{w}{p}\right)^{1/2} dx$$

the Sturm-Liouville equation takes the form

$$\frac{d^2 z}{dt^2} - f(t)\,z + \lambda z = 0 \tag{8-53}$$

Detailed consideration which may be left to the reader shows that the
function f(t) is bounded.

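Now consider, in place of (53), the differential equation

$$\frac{d^2 z}{dt^2} + \lambda' z = 0 \tag{8-53'}$$

Its eigenvalues are minima of $\displaystyle\int_0^\tau \left(\frac{dz}{dt}\right)^2 dt$, where $\tau$ is the value of t at
x = b, namely $\tau = \displaystyle\int_a^b \left(\frac{w}{p}\right)^{1/2} dx$. On the other hand, the eigenvalues of
(53) are the minima of

$$\int_0^\tau \left[\left(\frac{dz}{dt}\right)^2 + f z^2\right] dt$$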

Thus 


Assume that $z'$ is the specific function which produces the minimum $\lambda'$,
whereas z produces $\lambda$. If we compute $\lambda$ using $z'$, we shall obtain a value
for the integral that is greater than its minimum, $\lambda$. Hence,

$$\lambda < \int_0^\tau \left[\left(\frac{dz'}{dt}\right)^2 + f z'^2\right] dt = \lambda' + \int_0^\tau f z'^2\,dt$$

But since f is bounded and $z'$ is normalized, $\displaystyle\int f z'^2\,dt$ has some finite value

15 Suppose $u_n$ produces the minimum $\lambda_n$. Of the function $u_{n+1}$ we require that it be
orthogonal not only to the n − 1 functions with respect to which $u_n$ has this property,
but also to $u_n$ itself. Hence the class of functions from which $u_{n+1}$ must be chosen is
more restricted, and the minimum produced by $u_{n+1}$ cannot lie below $\lambda_n$. Now there is
an infinite number of functions orthogonal to the set $u_1, u_2, \cdots, u_n$, and it is hard to
believe that they will all produce the same eigenvalue $\lambda_n$.
$F'$, so that

$$\lambda < \lambda' + F'$$

If we proceed in the reverse manner and use z in computing $\lambda'$, we obtain

$$\lambda' < \int_0^\tau \left(\frac{dz}{dt}\right)^2 dt = \lambda - \int_0^\tau f z^2\,dt = \lambda - F$$

where F is again finite. Upon combining the last two inequalities we find

$$\lambda' + F < \lambda < \lambda' + F'$$

which means that $\lambda$ can differ from $\lambda'$ by only a finite amount. If the
$\lambda'$-values tend to $\infty$, the $\lambda$’s also do.

But the eigenvalues of (53') are well known. They depend, of course, on
the boundary conditions for z, and hence for u. If u vanishes at both a
and b, so that z vanishes at 0 and $\tau$, the eigenvalues are

$$\lambda_n' = \frac{n^2\pi^2}{\tau^2}$$

In case only periodicity of z is required (see example of preceding section),
the eigenvalues are

$$\lambda_n' = \frac{4n^2\pi^2}{\tau^2}$$

In any case,

$$\lambda_n' = \text{const.}\; n^2$$

Since the “high” eigenvalues $\lambda$ approach the “high” values of $\lambda'$, theorem
(52) is established. It is to be observed that our result in this particular
form is conditional upon the assumption of a finite $\tau$, which is usually equivalent
to a finite range of x. Several of the equations listed in Table 2 are
ordinarily treated for infinite ranges of the independent variable; for
these, theorem (52) is not valid because $\tau$ becomes $\infty$. Hermite’s equation
is a case in point: its eigenvalues are proportional to n rather than
$n^2$ even asymptotically. But here, as well as in all other cases, it is still
true that

$$\lambda_n \to \infty \quad\text{as}\quad n \to \infty \tag{8-54}$$

It is interesting to note that the solutions of eq. (53') are asymptotically
(for large $\lambda$) equal to those of (53). Thus

$$\lim_{n\to\infty} z_n = A_n \sin\frac{n\pi t}{\tau}$$

provided the boundary condition is: $z(0) = z(\tau) = 0$. In terms of u this
reads

$$\lim_{n\to\infty} u_n = A_n\,(pw)^{-1/4} \sin\left[\frac{n\pi}{\tau}\int_a^x \left(\frac{w}{p}\right)^{1/2} dx\right]$$
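The statement that $\lambda$ differs from $\lambda'$ by only a bounded amount can be watched numerically. The sketch below is an illustration under assumptions of our own choosing, not the method of the text: it takes $f(t) = t$ on $0 \le t \le \pi$ (so $\tau = \pi$ and $\lambda_n' = n^2$), integrates eq. (53) from $z(0) = 0$ by a fourth-order Runge-Kutta step, and finds each eigenvalue by bisection on $z(\tau)$. The bisection bracket is taken from the two-sided inequality itself, with the bounds of f standing in for F and F'.

```python
import math

def shoot(lam, f, tau, n=1000):
    """Integrate z'' = (f(t) - lam) z with z(0) = 0, z'(0) = 1 by RK4; return z(tau)."""
    h = tau / n
    t, z, dz = 0.0, 0.0, 1.0
    for _ in range(n):
        k1z, k1d = dz, (f(t) - lam) * z
        k2z, k2d = dz + 0.5 * h * k1d, (f(t + 0.5 * h) - lam) * (z + 0.5 * h * k1z)
        k3z, k3d = dz + 0.5 * h * k2d, (f(t + 0.5 * h) - lam) * (z + 0.5 * h * k2z)
        k4z, k4d = dz + h * k3d, (f(t + h) - lam) * (z + h * k3z)
        z += h * (k1z + 2 * k2z + 2 * k3z + k4z) / 6.0
        dz += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        t += h
    return z

def eigenvalue(n, f, tau, f_min, f_max):
    """n-th eigenvalue of z'' - f z + lam z = 0 with z(0) = z(tau) = 0, by bisection."""
    base = (n * math.pi / tau) ** 2      # n-th eigenvalue of the comparison equation (53')
    lo, hi = base + f_min, base + f_max  # bracket suggested by the inequality in the text
    z_lo = shoot(lo, f, tau)
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        z_mid = shoot(mid, f, tau)
        if z_lo * z_mid <= 0.0:
            hi = mid
        else:
            lo, z_lo = mid, z_mid
    return 0.5 * (lo + hi)

def f(t):
    return t

tau = math.pi
for n in range(1, 7):
    lam = eigenvalue(n, f, tau, 0.0, math.pi)
    print(n, lam, lam - n * n)  # the difference stays bounded as n grows
```

For this choice of f the difference $\lambda_n - \lambda_n'$ settles near $\pi/2$, the mean value of f over the range, while $\lambda_n$ itself grows like $n^2$.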
8.10. Completeness of Eigenfunctions.—In sec. 2 there appeared a
qualitative, although crude, definition of completeness. We now wish to
give that definition greater precision and to prove it under the conditions
outlined in sec. 8.7. A system of functions $u_1, u_2, \cdots$ is complete if it is
possible to “approximate in the mean” any function f(x), satisfying the
same boundary conditions as the u’s, by means of a series $\sum_{i=1}^{n} a_i u_i$, that is, if

$$\lim_{n\to\infty} \int \left(f - \sum_{i=1}^{n} a_i u_i\right)^2 w\,dx = 0 \tag{8-55}$$


na oO 


We are here concerned with functions u which are solutions of eq. (40);
hence we know them to be orthogonal. This permits at once the determination
of the coefficients $a_i$. If, for any given, finite n, we wish to make the
quantity

$$N = \int \left(f - \sum_{i=1}^{n} a_i u_i\right)^2 w\,dx$$

as small as possible, then

$$\frac{\partial N}{\partial a_j} = 0 \qquad\text{for}\quad j = 1, 2, \cdots, n$$

The differentiations may be carried out under the integral sign, so that

$$\int \left(f - \sum_i a_i u_i\right) u_j\,w\,dx = 0$$

whence

$$a_j = \int f u_j\,w\,dx \tag{8-56}$$


This, then, is the best choice of coefficients with which we may hope to 
satisfy (55). 
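As a concrete illustration of (56), the following Python sketch computes the coefficients and the mean-square error N for one example. The choices are assumptions made only for the illustration: the vibrating-string eigenfunctions $u_i = \sqrt{2/\pi}\,\sin ix$ on (0, π) with w = 1, and the function f(x) = x(π − x), which vanishes at both ends as the $u_i$ do. The helper names are hypothetical.

```python
import math

def integrate(g, a, b, n=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def u(i, x):
    """Orthonormal eigenfunctions on [0, pi] for weight w = 1."""
    return math.sqrt(2.0 / math.pi) * math.sin(i * x)

def f(x):
    """Illustrative function obeying the same boundary conditions as the u_i."""
    return x * (math.pi - x)

def coefficient(i):
    """a_i = integral of f u_i w dx, eq. (8-56)."""
    return integrate(lambda x: f(x) * u(i, x), 0.0, math.pi)

def mean_square_error(n):
    """Integral of (f - sum_{i<=n} a_i u_i)^2 w dx, the quantity N of the text."""
    a = [coefficient(i) for i in range(1, n + 1)]

    def residual_sq(x):
        partial_sum = sum(a[i - 1] * u(i, x) for i in range(1, n + 1))
        return (f(x) - partial_sum) ** 2

    return integrate(residual_sq, 0.0, math.pi)

for n in (1, 3, 5):
    print(n, mean_square_error(n))
```

The printed errors shrink as n increases, in the sense required by (55); the even coefficients vanish here because f is symmetric about x = π/2.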
Now introduce the following abbreviations

$$ \Delta_n = f - \sum_{i=1}^{n} a_i u_i, \qquad c_n = \Bigl[\int \Delta_n^2\, w\,dx\Bigr]^{1/2} $$

We shall show that the function $\Delta_n/c_n$ has the following properties:
(1) it is normalized, (2) it is orthogonal to every $u_i$ up to and including
$u_n$. The first property is obvious; the second is easily seen as follows:

$$ \int \Bigl(\frac{\Delta_n}{c_n}\Bigr) u_j w\,dx = \frac{1}{c_n}\Bigl(\int f u_j w\,dx - \sum_{i=1}^{n} a_i \int u_i u_j w\,dx\Bigr) $$

whence

$$ \int \Bigl(\frac{\Delta_n}{c_n}\Bigr) u_j w\,dx = \begin{cases} \dfrac{1}{c_n}(a_j - a_j) = 0 & \text{if } j \le n \\[1ex] \dfrac{1}{c_n}(a_j - 0) & \text{if } j > n \end{cases} $$

But if $\Delta_n/c_n$ has these two properties, it satisfies all the conditions
which, in the variational procedure, we imposed upon $u_{n+1}$, except that of
minimizing Λ. Hence it is clear that

$$ \Lambda\Bigl(\frac{\Delta_n}{c_n}\Bigr) \ge \Lambda(u_{n+1}) $$

and this means:

$$ \frac{1}{c_n^2}\,\Lambda(\Delta_n) \ge \lambda_{n+1} \tag{8-57} $$

The remainder of our argument consists in proving that $\Lambda(\Delta_n)$ is
finite. If the reader will accept this fact,¹⁸ which is almost obvious from the
meaning of $\Delta_n$, the last inequality leads at once to (55); for as n
approaches infinity, the right-hand side tends to infinity in view of (54),
hence

$$ \lim_{n\to\infty} c_n^2 = 0 $$

This is the same as (55).
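Inequality (57) can be verified numerically in a simple case. The sketch below is an illustration only, under assumptions not taken from the text: the string problem on (0, π) with p = w = 1 and q = 0, so that $\lambda_n = n^2$ and the variational integral is taken to be $\Lambda(\varphi) = \int \varphi'^2 dx$; the function f(x) = x(π − x); and the closed-form values $a_i = \sqrt{2/\pi}\cdot 2[1 - (-1)^i]/i^3$, $\int f^2 dx = \pi^5/30$, $\Lambda(f) = \pi^3/3$. It uses $c_n^2 = \int f^2 w\,dx - \sum a_i^2$ and $\Lambda(\Delta_n) = \Lambda(f) - \sum \lambda_i a_i^2$.

```python
import math

def a(i):
    """Closed-form a_i for f = x(pi - x) and u_i = sqrt(2/pi) sin(ix) on [0, pi]."""
    return math.sqrt(2.0 / math.pi) * 2.0 * (1 - (-1) ** i) / i ** 3

NORM_F_SQ = math.pi ** 5 / 30.0   # integral of f^2 dx over [0, pi]
LAMBDA_F = math.pi ** 3 / 3.0     # integral of f'^2 dx, used here as Lambda(f)

def c_squared(n):
    """c_n^2 = integral of Delta_n^2 w dx = integral of f^2 minus sum of a_i^2."""
    return NORM_F_SQ - sum(a(i) ** 2 for i in range(1, n + 1))

def lambda_of_residual(n):
    """Lambda(Delta_n) = Lambda(f) - sum of lambda_i a_i^2, with lambda_i = i^2."""
    return LAMBDA_F - sum(i ** 2 * a(i) ** 2 for i in range(1, n + 1))

for n in range(1, 8):
    lhs = lambda_of_residual(n) / c_squared(n)   # left-hand side of (8-57)
    print(n, lhs, (n + 1) ** 2, lhs >= (n + 1) ** 2)
```

In each printed row the left-hand side of (57) exceeds $\lambda_{n+1} = (n+1)^2$, while $c_n^2$ itself decreases. The range is kept to small n because $c_n^2$ is formed as a difference of nearly equal numbers.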


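18 For the more exacting reader, we here indicate the proof. The integral
$\Lambda(\Delta_n)$ may be transformed in accordance with the first three steps
of (51) into

$$ \Lambda(\Delta_n) = -\int \Delta_n L(\Delta_n)\,dx $$

But

$$ \int \Delta_n L(\Delta_n)\,dx = \int \Bigl(f - \sum_i a_i u_i\Bigr) L\Bigl(f - \sum_j a_j u_j\Bigr) dx = \int f L(f)\,dx + \sum_{i=1}^{n} \lambda_i a_i^2 $$

because

$$ L(u_i) = -\lambda_i w u_i, \quad \text{and} \quad \int f L(u_i)\,dx = \int u_i L(f)\,dx = -\lambda_i a_i $$

Hence,

$$ \Lambda(\Delta_n) = \Lambda(f) - \sum_{i=1}^{n} \lambda_i a_i^2 $$

The existence of Λ(f) must be assumed, for otherwise an expansion of f in terms
of the u's may be impossible. Moreover, f and therefore the approximating
function $\varphi_n = \sum_{i=1}^{n} a_i u_i$ must possess integrable squares.
Let us suppose that

$$ \int \varphi_n^2\, w\,dx = \sum_{i=1}^{n} a_i^2 = M_n $$

If we add zero in the form $\sum_{i=1}^{n} \lambda_1 a_i^2 - M_n\lambda_1$, where
$\lambda_1$ is the lowest of all eigenvalues, to the last expression for
$\Lambda(\Delta_n)$ we obtain

$$ \Lambda(\Delta_n) = \Lambda(f) - \sum_{i=1}^{n} (\lambda_i - \lambda_1) a_i^2 - M_n\lambda_1 $$

The difference $\Lambda(f) - M_n\lambda_1$ is certainly finite for all n. Let us
call it A. The summation on the right consists of positive terms only.
Inequality (57) may therefore certainly be written

$$ \frac{A}{c_n^2} \ge \lambda_{n+1} $$

This forces $c_n$ to become zero for large n since $\lambda_{n+1}$ tends to ∞.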
8.11. Further Comments and Generalizations.—In the last section we
have shown, not that

$$ f = \sum_{i=1}^{\infty} a_i u_i \tag{8-58} $$

but rather that the series on the right approximates f "in the mean" in
accordance with eq. (55). To put the difference more concretely: Eq. (55)
may be true and yet (58) may not hold for all points of the range
(a < x < b). It is clear that if (58) is true almost everywhere but fails
at a finite set of points, the contribution of these points to the integral in
(55) would be nil and that equation would be true. To prove (58) in
addition to (55) would involve the establishment of absolute and uniform
convergence of the series $\sum_{i=1}^{\infty} a_i u_i$. For the solutions of
eq. (40) with boundary conditions of the type here chosen this can indeed be
done,¹⁹ and the reader need not be excessively concerned over the difference
between "completeness" (expressed by eq. 55) and the possibility of expansion
of an arbitrary function (indicated by 58).

The preceding theory has always involved the assumption of a finite
range, b − a, of the independent variable. This is clearly a serious
limitation, for it excludes the usual solutions of a number of the equations
listed in Table 2. To develop a rigorous account of the situation arising when
the range is extended to infinity is not easy, but what happens qualitatively
under such conditions can be readily seen.

Consider again the vibrating string with eigenvalues k² = n²π²/l².
As l tends to infinity, these eigenvalues move closer together until in the
limit they form a continuum. The eigenfunctions are still of the form
A sin (kx + δ), but they refuse to be normalized in the former sense; for
clearly the integral $\int \sin^2 kx\,dx$, when taken over an infinite range,
diverges. Also, since the eigenvalues are no longer discrete, our definition
of orthogonality loses its sense. However, completeness is still guaranteed
since what was originally a Fourier series will now become a Fourier
integral (cf. eq. 14). The difficulty concerning orthogonality and
normalization can, however, be avoided by introducing "eigendifferentials"
instead of eigenfunctions.²⁰

The situation brought about by an extension of the range may be even
more complicated than this. We shall see in Chapter 11 that the differential
equation describing the hydrogen atom (eq. 11-55), which is closely
related to Laguerre's, admits, because of its infinite range, both a discrete
and a continuous set of eigenvalues ("spectrum"). This phenomenon is
of very frequent occurrence. On the other hand, eigenvalues associated
with a Sturm-Liouville problem of infinite range are not necessarily
continuous, as the example of the simple harmonic oscillator (cf. Chapter 11)
or Hermite's differential equation (eq. 2-62) clearly shows.

19 See Courant-Hilbert, p. 370.
20 See, for instance, Kemble, E. C., "The Fundamental Principles of Quantum
Mechanics," McGraw-Hill Book Co., 1937, p. 162 et seq.

No mention has thus far been made of the possibility that the solutions
of the Sturm-Liouville equation may possess singularities in the range
a < x < b. Troubles of this sort might have been circumvented by
postulating that the function p appearing in eq. (41) be always of one
sign and never zero, as is sometimes done in treatments of the eigenvalue
problem. This, however, would have excluded some interesting cases from
Table 2, notably Legendre's equation which has (non-essential) singular
points at x = ±1, and Hermite's equation which has an essential
singularity at ∞. Suffice it to say here that these matters, although of
considerable fundamental interest, occasion no modification of the conclusions
here derived. Attention is given to them in Kemble's book (loc. cit.).

The solutions of eq. (40) have been assumed to be real functions
throughout this section. If the functions p, q, and w are real, this entails
no loss in generality. Suppose that a complex function u = X + iY
were admitted as solution of the differential equation²¹: this would merely
imply that both X and Y are real solutions belonging to the same
eigenvalue. Thus, whenever complex solutions arise and are compatible with
the boundary conditions, we may at once conclude that the corresponding
eigenvalue is degenerate. (In the complex scheme, both u and
u* = X − iY are linearly independent solutions.) If now we require as
normalizing condition

$$ \int u^* u\, w\,dx = 1 $$

we are merely postulating that, in place of the usual normalization

$$ \Bigl(\int X^2 w\,dx = \int Y^2 w\,dx = 1\Bigr), $$

$$ \int X^2 w\,dx + \int Y^2 w\,dx = 1 $$

shall hold. In other words, we are operating, in the complex scheme, with
linear combinations of the real functions, and with a different
normalization. Orthogonality, if defined by eq. (5*) instead of (5), reverts to
its ordinary meaning, for

$$ \int u_1^* u_2\, w\,dx = \int [X_1X_2 + Y_1Y_2 + iX_1Y_2 - iY_1X_2]\, w\,dx = 0 $$

is an immediate consequence of the fact that eigenfunctions belonging to
different eigenvalues $\lambda_1$ and $\lambda_2$ are orthogonal. Furthermore,
if we require u* and u to be orthogonal,

$$ \int u^2 w\,dx = \int (X^2 - Y^2 + 2iXY)\, w\,dx = \int X^2 w\,dx - \int Y^2 w\,dx = 0 $$

provided X and Y have been chosen orthogonal. Thus both X and Y are
normalized to ½ when the complex formalism is used. In view of these
simple facts the validity of the completeness proof remains intact
for complex f and complex u; only formal changes are necessary.
The $a_i$ become complex, and completeness is defined by the relation
$\lim_{n\to\infty} \int \Delta_n^* \Delta_n\, w\,dx = 0$. Complete revision of
the theory is necessary when the coefficients p, q, w are permitted to be
complex.

21 We may, for instance, write the solution of eq. (2) in the form
$S = Ae^{\pm ikx}$.
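The statements about normalization in the complex scheme are easily checked for a specific function. The Python sketch below is an illustration under assumptions of our own choosing: the periodic solution $u = e^{ix}/\sqrt{2\pi}$ on (0, 2π) with w = 1, so that $X = \cos x/\sqrt{2\pi}$ and $Y = \sin x/\sqrt{2\pi}$. The helper `integrate` is a hypothetical midpoint-rule routine.

```python
import cmath
import math

def integrate(g, a, b, n=2000):
    """Midpoint rule; g may be real- or complex-valued."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def u(x):
    """u = X + iY = exp(ix)/sqrt(2 pi), periodic on [0, 2 pi], weight w = 1."""
    return cmath.exp(1j * x) / math.sqrt(2.0 * math.pi)

LOWER, UPPER = 0.0, 2.0 * math.pi
norm_u = integrate(lambda x: abs(u(x)) ** 2, LOWER, UPPER)    # integral of u* u
norm_X = integrate(lambda x: u(x).real ** 2, LOWER, UPPER)    # integral of X^2
norm_Y = integrate(lambda x: u(x).imag ** 2, LOWER, UPPER)    # integral of Y^2
overlap = integrate(lambda x: u(x) * u(x), LOWER, UPPER)      # integral of (u*)* u = u^2
print(norm_u, norm_X, norm_Y, abs(overlap))
```

Here u is normalized to 1, X and Y are each normalized to ½, and u* is orthogonal to u, as the text asserts.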

Finally, it is appropriate to remark that our development has been 
restricted to one dimension. The Sturm-Liouville theory can be gener- 
alized without great difficulty to certain partial differential equations with 
much the same results. For this generalization we refer the reader to 
Courant-Hilbert. 

Eigenvalue problems arise in the most diverse fields of physics and 
chemistry. Many of them are treated in: 


Morse, P. M., and Feshbach, H., "Methods of Theoretical Physics," McGraw-Hill,
New York, 1953.

Jeffreys, H., and Jeffreys, B. S., "Methods of Mathematical Physics," Cambridge
University Press, Second Edition, 1950.

See also the bibliography on quantum mechanics.


CHAPTER 9 
MECHANICS OF MOLECULES 


9.1. Introduction.—As an illustration of the mathematical methods
used in mechanics, we discuss in this chapter an important physical and
chemical problem, namely, the motion of a molecule containing n atoms.
We limit ourselves to this single topic for several reasons: its complexity
requires us to describe most of the mathematics used in mechanics; the
same methods may be extended to other problems, for example, the
motions of particles within the atomic nucleus or the motions of a
macroscopic body such as an aeroplane¹; and finally because the structure of
the polyatomic molecule and its spectra are matters of considerable interest
to many chemists and physicists. This chapter will also present an
opportunity for dealing with the purely mathematical question of how to
describe the configuration of a rigid body (Euler's angles, etc.), a matter
which is of great generality and must be included in a survey of mathematical
methods used in science. Many adequate accounts of classical mechanics²
exist so that we do no more here than recall briefly some of the principles
of that subject before proceeding to the special problem in which we are
interested.

9.2. General Principles of Classical Mechanics.—A free particle is one
whose motion is completely unrestricted. It is said to have three degrees
of freedom, for its position is uniquely determined at any instant by three
independent coordinates. Consider a system containing n such particles,
where the instantaneous position of the i-th particle of mass $m_i$ is
specified by the vector $\mathbf{r}_i$. If $\mathbf{F}_i$ is the vector resultant
of all the forces acting upon the particle then the motion of the system is
described by Newton's equations which may be written in the form

$$ m_i \frac{d^2\mathbf{r}_i}{dt^2} = m_i\ddot{\mathbf{r}}_i = \mathbf{F}_i \qquad (i = 1, 2, \cdots, n) \tag{9-1} $$

In many cases, the particles composing the system are not free but
restricted. For example, a member of the system may be allowed to
move only on a surface, so that its degrees of freedom become two. Under
such circumstances the equation of the surface is called the constraint. In a
similar way if the particle is required to move along a line, there is only one
degree of freedom and the two equations which define the line are the
constraints. If the sum of the degrees of freedom of all the particles is
k < 3n, then the system may be regarded as a collection of free particles
subjected to 3n − k independent constraints so that only k coordinates
are needed to describe the motion of the system. These new coordinates
$q_1, q_2, \cdots, q_k$ are related to the Cartesian coordinates of the particles
(cf. eqs. 5-1 and 5-2); they are called the generalized coordinates of Lagrange.

1 See Frazer, R. A., Duncan, W. J., and Collar, A. R., "Elementary Matrices,"
Cambridge University Press, 1938.

2 Whittaker, E. T., "Analytical Dynamics of Particles and Rigid Bodies," Third
Edition, Cambridge University Press, 1927; Corben, H. C., and Stehle, P.,
"Classical Mechanics," John Wiley and Sons, Inc., New York, 1950; Goldstein,
Herbert, "Classical Mechanics," Addison-Wesley Press, Inc., Cambridge, Mass.,
1951.

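If, for convenience, we let the Cartesian components of $\mathbf{r}_1$ be
$x_1, x_2, x_3$; the components of $\mathbf{r}_2$ be $x_4, x_5, x_6$ and so on
(remembering also that $m_1 = m_2 = m_3$; $m_4 = m_5 = m_6$; etc.), then the
kinetic energy T of the system is given by

$$ 2T = \sum_{i=1}^{3n} m_i \dot{x}_i^2 = \sum_{r=1}^{k}\sum_{s=1}^{k} A_{rs}\,\dot{q}_r \dot{q}_s \tag{9-2} $$

where

$$ A_{rs} = \sum_{i=1}^{3n} m_i \frac{\partial x_i}{\partial q_r}\frac{\partial x_i}{\partial q_s} \tag{9-3} $$

Since the components of momentum in Cartesian coordinates are

$$ p_i = m_i \dot{x}_i = \frac{\partial T}{\partial \dot{x}_i} $$

we define, by analogy, the generalized momenta as

$$ p_r(q_1, \cdots, q_k, \dot{q}_1, \cdots, \dot{q}_k) = \frac{\partial T}{\partial \dot{q}_r} = \sum_{s=1}^{k} A_{rs}\,\dot{q}_s \tag{9-4} $$

In many physical problems, the system is conservative, that is, a potential
function $V(q_1, q_2, \cdots, q_k)$ exists such that

$$ Q_i = -\frac{\partial V}{\partial q_i}; \qquad (i = 1, 2, \cdots, k) \tag{9-5} $$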


Then, as was shown in sec. 6.3 (cf. eq. 6-11), Lagrange's equations of
motion are

$$ \frac{d}{dt}\Bigl(\frac{\partial T}{\partial \dot{q}_i}\Bigr) - \frac{\partial T}{\partial q_i} = Q_i; \qquad (i = 1, 2, \cdots, k) \tag{9-6} $$

This is a set of k differential equations of second order with
$q_1, q_2, \cdots, q_k$ as dependent variables and t as independent variable.
If we introduce the Lagrangian function

$$ L(q_i, \dot{q}_i) = T(q_i, \dot{q}_i) - V(q_i) \tag{9-7} $$

eq. (6) becomes

$$ \frac{d}{dt}\Bigl(\frac{\partial L}{\partial \dot{q}_i}\Bigr) - \frac{\partial L}{\partial q_i} = 0; \qquad (i = 1, 2, \cdots, k) \tag{9-8} $$

The solution of Lagrange's equations in either form (6) or (8) will result
in an expression for each generalized coordinate $q_i$ as a function of time
and 2k constants of integration. The latter must be determined from the
initial conditions of the n particles of the system.

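It is often of advantage to transform (6) or (8) to a set of 2k first order
differential equations. From (4), (8), and the definition L = T − V,
we have

$$ p_i = \frac{\partial L}{\partial \dot{q}_i}; \qquad \dot{p}_i = \frac{\partial L}{\partial q_i} \tag{9-9} $$

We now define the Hamiltonian function

$$ H = \sum_{i=1}^{k} p_i \dot{q}_i - L \tag{9-10} $$

Its total differential is

$$ dH = \sum_{i=1}^{k} p_i\, d\dot{q}_i + \sum_{i=1}^{k} \dot{q}_i\, dp_i - \sum_{i=1}^{k} \frac{\partial L}{\partial q_i}\, dq_i - \sum_{i=1}^{k} \frac{\partial L}{\partial \dot{q}_i}\, d\dot{q}_i $$

But by using (9), the first and last terms cancel, giving

$$ dH = \sum_{i=1}^{k} \dot{q}_i\, dp_i - \sum_{i=1}^{k} \frac{\partial L}{\partial q_i}\, dq_i \tag{9-11} $$

This equation depends only on $dp_i$ and $dq_i$ but not on $d\dot{q}_i$, hence H
is a function of q and p alone and we may write

$$ dH = \sum_{i=1}^{k} \frac{\partial H}{\partial p_i}\, dp_i + \sum_{i=1}^{k} \frac{\partial H}{\partial q_i}\, dq_i \tag{9-12} $$

Comparison of (11) with (12) shows us that

$$ \frac{\partial H}{\partial p_i} = \dot{q}_i; \qquad \frac{\partial H}{\partial q_i} = -\frac{\partial L}{\partial q_i} = -\dot{p}_i; \qquad (i = 1, 2, \cdots, k) \tag{9-13} $$

The resulting first order differential equations (13), 2k in number, are
Hamilton's canonical equations of motion; $p_i$ and $q_i$ are said to be
canonically conjugate variables.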
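To show how the canonical equations (13) are used in practice, the following Python sketch integrates them for one degree of freedom. It is a minimal illustration under assumptions not made in the text: a harmonic oscillator with $H = p^2/2m + \tfrac{1}{2}kq^2$ and the hypothetical values m = k = 1, advanced by the leapfrog scheme, which applies to Hamiltonians of the separable form T(p) + V(q).

```python
import math

def leapfrog(q, p, dt, steps, dV_dq, mass):
    """Advance dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -dV/dq for H = p^2/(2m) + V(q)."""
    for _ in range(steps):
        p -= 0.5 * dt * dV_dq(q)     # half step in the momentum
        q += dt * p / mass           # full step in the coordinate
        p -= 0.5 * dt * dV_dq(q)     # second half step in the momentum
    return q, p

def hamiltonian(q, p, mass, k):
    """H = T + V for the harmonic oscillator."""
    return p * p / (2.0 * mass) + 0.5 * k * q * q

MASS, K = 1.0, 1.0
PERIOD = 2.0 * math.pi * math.sqrt(MASS / K)
STEPS = 10000
q1, p1 = leapfrog(1.0, 0.0, PERIOD / STEPS, STEPS, lambda q: K * q, MASS)
print(q1, p1, hamiltonian(q1, p1, MASS, K))
```

After one period the coordinate and momentum return close to their initial values (q = 1, p = 0), and H stays close to its initial value ½; the two initial values play the role of the 2k constants of integration.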


Problem. Show that $2T = \sum_i p_i \dot{q}_i$ and H = T + V.


9.3. The Rigid Body in Classical Mechanics.—As a crude first
approximation to the motion of a molecule we consider a rigid body which is
defined as a system of n particles bound together by interior forces in such a
way that the distance between the i-th and j-th particles is constant and
unaffected by any external force to which the system is subjected. Suppose
$x_i, y_i, z_i$ are the Cartesian coordinates of the i-th particle, then the
distance between the i-th and j-th particle is

$$ r_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} = \text{constant}; \qquad (i, j = 1, 2, \cdots, n) \tag{9-14} $$


It is readily shown that the most general displacement of a body of this 
sort may be obtained in a variety of ways by a combination of translation 
and rotation about an axis fixed in the body. The proof of this fact, known 
as Chasles’s theorem, may be found, for example, in Whittaker, loc. cit. The 
choice of a reference point, that is, the origin of the vector which locates the 
fixed axis, is entirely arbitrary. For a given displacement, this point may 
be chosen in such a way that the translation is parallel to the axis of rota- 
tion. With this choice of reference point, each displacement can be effected 
in one and only one way, the resulting motion being similar to the displace- 
ment of a nut on a threaded screw. It is thus only necessary to consider 
translation and rotation in order to study the most general motion of a rigid 
body. It should be remembered, however, that the axis of rotation may 
be continually changing its direction, hence we usually refer to an instan- 
taneous axis of rotation. 

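9.4. Velocity, Angular Momentum, and Kinetic Energy.—Suppose a
rigid body is rotating about an axis with a constant angular velocity
$\boldsymbol{\omega}$; then the linear velocity of any point P in the body is
given by

$$ \mathbf{v} = \boldsymbol{\omega} \times \mathbf{r} \tag{9-15} $$

where $\mathbf{r}$ is a radius vector drawn to P from a fixed point O on the axis
of rotation (see eq. 4-16). If the point P has a mass m, its momentum is

$$ m\mathbf{v} = m(\boldsymbol{\omega} \times \mathbf{r}) \tag{9-16} $$

and its moment of momentum or angular momentum (see sec. 4.5) about the
point O is

$$ \mathbf{M} = \mathbf{r} \times m\mathbf{v} = m\,\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r}) \tag{9-17} $$

Suppose the fixed point O about which the body is rotating is taken as the
origin of a Cartesian coordinate system OXYZ, the components of
$\boldsymbol{\omega}$ are $\omega_x, \omega_y, \omega_z$ and the components of
$\mathbf{r}$ are x, y, z. Then in accordance with eq. (4-13), the components of
$\mathbf{v}$ are:

$$ \begin{aligned} v_x &= z\omega_y - y\omega_z \\ v_y &= x\omega_z - z\omega_x \\ v_z &= y\omega_x - x\omega_y \end{aligned} \tag{9-18} $$

and the components of $\mathbf{M}$ are:

$$ \begin{aligned} M_x &= m(yv_z - zv_y) \\ M_y &= m(zv_x - xv_z) \\ M_z &= m(xv_y - yv_x) \end{aligned} \tag{9-19} $$

On combining (18) and (19) there results

$$ \begin{aligned} M_x &= A\omega_x - F\omega_y - E\omega_z \\ M_y &= B\omega_y - D\omega_z - F\omega_x \\ M_z &= C\omega_z - E\omega_x - D\omega_y \end{aligned} \tag{9-20} $$

where A, B, C are moments of inertia and D, E, F are products of inertia:

$$ \begin{aligned} A &= m(y^2 + z^2); & D &= myz \\ B &= m(z^2 + x^2); & E &= mzx \\ C &= m(x^2 + y^2); & F &= mxy \end{aligned} \tag{9-21} $$

The kinetic energy T of the particle at P is given by

$$ 2T = m\mathbf{v}\cdot(\boldsymbol{\omega}\times\mathbf{r}) = m[\mathbf{v}\boldsymbol{\omega}\mathbf{r}] = m[\boldsymbol{\omega}\mathbf{r}\mathbf{v}] = m\boldsymbol{\omega}\cdot(\mathbf{r}\times\mathbf{v}) = \boldsymbol{\omega}\cdot\mathbf{M} \tag{9-22} $$

where we have used eqs. (4-17), (4-18), and (9-17). Thus, in view of
(20) we find

$$ 2T = A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_y\omega_z - 2E\omega_z\omega_x - 2F\omega_x\omega_y \tag{9-23} $$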
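The chain of relations from (15) to (23) can be checked for a single particle. The Python sketch below is an illustration with arbitrarily chosen, hypothetical numbers: it computes 2T directly from $m|\boldsymbol{\omega}\times\mathbf{r}|^2$, from $\boldsymbol{\omega}\cdot\mathbf{M}$ as in (22), and from the moments and products of inertia of (21) inserted in (23).

```python
def cross(a, b):
    """Vector product a x b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))

def two_T_direct(m, omega, r):
    """2T = m v.v with v = omega x r, eq. (9-15)."""
    v = cross(omega, r)
    return m * dot(v, v)

def angular_momentum(m, omega, r):
    """M = m r x (omega x r), eq. (9-17)."""
    return tuple(m * c for c in cross(r, cross(omega, r)))

def two_T_inertia(m, omega, r):
    """2T from eq. (9-23), with A..F as defined in eq. (9-21)."""
    x, y, z = r
    wx, wy, wz = omega
    A, B, C = m * (y * y + z * z), m * (z * z + x * x), m * (x * x + y * y)
    D, E, F = m * y * z, m * z * x, m * x * y
    return (A * wx * wx + B * wy * wy + C * wz * wz
            - 2 * D * wy * wz - 2 * E * wz * wx - 2 * F * wx * wy)

m, omega, r = 2.0, (0.3, -1.2, 0.7), (1.0, 2.0, -0.5)
print(two_T_direct(m, omega, r),
      dot(omega, angular_momentum(m, omega, r)),
      two_T_inertia(m, omega, r))
```

The three printed numbers agree to rounding error. For an extended body the same check applies after summing A, ..., F over all particles.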


9.5. The Eulerian Angles.—We digress here to give explicit relations
useful for locating a point P in a rigid body. Six parameters are needed.
Three of them will specify a fixed reference point in the body, which is not
necessarily at the origin of the coordinate system as in the preceding
discussion. Two more parameters are required to define the position of a line
fixed in the body and passing through the fixed point, while the sixth
parameter defines a rotation of the body about this line.

Suppose we attach a rigid framework O'X'Y'Z' to the body and
denote the position of its origin relative to a coordinate system OXYZ
fixed in space by $x_0, y_0, z_0$. We will also suppose that we know the nine
direction cosines $a_{ij}$ of O'X'Y'Z' relative to OXYZ. The point P may
then be located in either coordinate system at will for we have the relations
(see sec. 4.1)

$$ \begin{aligned} x &= x_0 + a_{11}x' + a_{12}y' + a_{13}z' \\ y &= y_0 + a_{21}x' + a_{22}y' + a_{23}z' \\ z &= z_0 + a_{31}x' + a_{32}y' + a_{33}z' \end{aligned} $$

where x, y, z refer to OXYZ and x', y', z' refer to O'X'Y'Z'. Let us choose
$x_0, y_0, z_0$ as three of the parameters required. The nine direction cosines
which remain, and which we know are not nearly independent, may then
be combined in a variety of ways in order to obtain the three additional
independent parameters needed. Some useful combinations are the
Euler-Rodrigues parameters, the Cayley-Klein parameters³ and the Eulerian
angles. The latter are suitable for the present purpose and will now be
described.

Unfortunately, the Euler angles have been defined in several different
ways in the scientific literature and great confusion occurs when one
attempts to compare the results of various writers. The one which we
adopt in the following is that favored by the majority of more than fifty
references⁴ which have been consulted. A possible advantage of it lies
in the fact that our angles α and β become the polar angles, φ and θ,
respectively, in spherical polar coordinates. Moreover, a rotation about
the OX-axis toward the OY-axis, as is required in the second step of our
procedure, seems to be a natural operation. This step does, however,
introduce additional imaginary factors into the Cayley-Klein parameters
and the representations of the three-dimensional group (see sec. 15.15).
Further complications also result when one compares the wavefunctions
of the asymmetric top in quantum mechanics with those of its limiting case,
the symmetric top. These latter objections are removed if the second
rotation is made about the OY-axis, instead of the OX-axis, as is done by
Whittaker (loc. cit.) and by Wigner.⁵ Note, however, that Wigner has
used a left-handed coordinate system.

Let us return to the problem of describing the Eulerian angles, which
we show according to our definition in Fig. 1. Perhaps a clearer conception
of the relations involved may be obtained from the cross-section diagrams
of Fig. 2, which give the planes XOY, ZOZ', and X'OY' and which show,
in parentheses, the axes perpendicular to the plane of the page. It will
be seen that the axis OK, called the line of nodes, is the intersection of the
XOY and X'OY' planes. The axis OL is perpendicular to OK in the XOY
plane, and OM is perpendicular to OK in the X'OY' plane. Study of
Fig. 2 will show that OXYZ may be superimposed on OX'Y'Z' by the
following rotations, provided that they are performed in the order given
and always in a counterclockwise direction: (1) rotate about OZ by the
angle α; (2) rotate through β about OK (OK and OX are now identical
because of the first rotation) which will bring OZ into coincidence with OZ';
(3) rotate about OZ' by γ which then brings OX to OX' and OY to OY'.

3 The Cayley-Klein parameters are related to the Pauli spin matrices used in
quantum mechanics, as will be shown in sec. 15.15.

4 It agrees with that chosen by Goldstein (loc. cit.), who has also commented on
the conflicting definitions of the Euler angles. In his notation, our symbols
are: α = φ, β = θ, γ = ψ. It should be noted that our present equations differ
from those in the first edition of this book, since we inadvertently used a
left-handed coordinate system there.

5 Wigner, E., "Gruppentheorie und ihre Anwendung auf die Quantenmechanik der
Atomspektren," Friedr. Vieweg und Sohn, Braunschweig, 1931.

Relations between OXYZ and OX'Y'Z' may be found most simply by
matrix methods (see Chapter 10). Suppose a vector x in the space-fixed
system becomes x' in the body-fixed system; then the matrix which
connects the two vectors is R(α,β,γ), where

$$ \mathbf{x}' = R(\alpha,\beta,\gamma)\,\mathbf{x} $$

and

$$ R(\alpha,\beta,\gamma) = R_z(\gamma)\, R_x(\beta)\, R_z(\alpha) $$

The first and last of these matrices are like the matrix R* of sec. 10.17,
with γ or α in place of θ. The remaining matrix is similar in form but
rearranged to represent a rotation about the OX-axis.

When the matrix product is evaluated, the result is that given in
Table 1. It should be interpreted in a manner similar to that of Table 4-1.


TABLE 1

        OX                                 OY                                  OZ
OX'     cos α cos γ − sin α cos β sin γ    sin α cos γ + cos α cos β sin γ     sin β sin γ
OY'     −cos α sin γ − sin α cos β cos γ   −sin α sin γ + cos α cos β cos γ    sin β cos γ
OZ'     sin α sin β                        −cos α sin β                        cos β
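The entries of Table 1 can be reproduced by carrying out the matrix product numerically. The Python sketch below is an illustration that assumes the elementary matrices have the usual form for a rotation of the coordinate axes (the forms written in `rot_z` and `rot_x` are assumptions, since the matrix of sec. 10.17 is not reproduced here); the function names are hypothetical.

```python
import math

def matmul(P, Q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_z(t):
    """Rotation of the axes through t about the z-axis (assumed form)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    """Rotation of the axes through t about the x-axis (assumed form)."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def euler_matrix(alpha, beta, gamma):
    """R(alpha, beta, gamma) = R_z(gamma) R_x(beta) R_z(alpha)."""
    return matmul(rot_z(gamma), matmul(rot_x(beta), rot_z(alpha)))

def table_1(alpha, beta, gamma):
    """Direction cosines as listed in Table 1 (rows OX', OY', OZ'; columns OX, OY, OZ)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [[ca * cg - sa * cb * sg, sa * cg + ca * cb * sg, sb * sg],
            [-ca * sg - sa * cb * cg, -sa * sg + ca * cb * cg, sb * cg],
            [sa * sb, -ca * sb, cb]]

R = euler_matrix(0.4, 1.1, -0.7)
T1 = table_1(0.4, 1.1, -0.7)
print(max(abs(R[i][j] - T1[i][j]) for i in range(3) for j in range(3)))
```

The printed maximum difference is at the level of rounding error, so the product and the tabulated direction cosines agree for the sample angles; R is also orthogonal, as a matrix of direction cosines must be.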


In order to obtain the angular velocity in terms of the Euler angles, it
is convenient to use the body-fixed system, OX'Y'Z', with components
$\dot\alpha$, $\dot\beta$, and $\dot\gamma$ along OZ, OK, and OZ', respectively.
Since $\dot\alpha$ is parallel to the space-fixed axis, OZ, its components are
given by the last column of Table 1. The components of $\dot\beta$, which is
parallel to OK, may be found from the first column of the matrix $R_z(\gamma)$.
Finally, since $\dot\gamma$ is parallel to OZ', its only component is
$\dot\gamma$. Collecting these results, we have

$$ \begin{aligned} \omega_x &= \sin\beta \sin\gamma\, \dot\alpha + \cos\gamma\, \dot\beta \\ \omega_y &= \sin\beta \cos\gamma\, \dot\alpha - \sin\gamma\, \dot\beta \\ \omega_z &= \cos\beta\, \dot\alpha + \dot\gamma \end{aligned} \tag{9-24} $$

for the three components of angular velocity along OX', OY', and OZ'.
In terms of the Eulerian angles, the kinetic energy of a rotating symmetric
top (A = B), which we shall need later, is seen from eq. (23) to be

$$ T = \tfrac{1}{2}A\dot\beta^2 + \tfrac{1}{2}A\dot\alpha^2 \sin^2\beta + \tfrac{1}{2}C(\dot\gamma + \dot\alpha\cos\beta)^2 \tag{9-25} $$

provided we choose OX', OY', and OZ' to coincide with the principal axes
of inertia of the top, for then the products of inertia D, E, and F all vanish.
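The passage from (23) and (24) to (25) can be confirmed numerically. The following Python sketch is an illustration with hypothetical sample values: it forms the body-fixed components of angular velocity from the Euler angles and their rates, evaluates T from (23) with A = B and D = E = F = 0, and compares the result with (25).

```python
import math

def body_angular_velocity(beta, gamma, d_alpha, d_beta, d_gamma):
    """Components of angular velocity along OX', OY', OZ', eq. (9-24)."""
    wx = math.sin(beta) * math.sin(gamma) * d_alpha + math.cos(gamma) * d_beta
    wy = math.sin(beta) * math.cos(gamma) * d_alpha - math.sin(gamma) * d_beta
    wz = math.cos(beta) * d_alpha + d_gamma
    return wx, wy, wz

def T_from_omega(A, C, omega):
    """Eq. (9-23) for a symmetric top: A = B and D = E = F = 0."""
    wx, wy, wz = omega
    return 0.5 * (A * (wx * wx + wy * wy) + C * wz * wz)

def T_symmetric_top(A, C, beta, d_alpha, d_beta, d_gamma):
    """Eq. (9-25) written directly in terms of the Euler angles."""
    return (0.5 * A * d_beta ** 2
            + 0.5 * A * d_alpha ** 2 * math.sin(beta) ** 2
            + 0.5 * C * (d_gamma + d_alpha * math.cos(beta)) ** 2)

A, C = 3.0, 1.5                       # hypothetical principal moments of inertia
beta, gamma = 0.8, 2.1                # sample Euler angles (alpha does not enter)
rates = (0.6, -0.4, 1.3)              # sample values of the three angular rates
omega = body_angular_velocity(beta, gamma, *rates)
print(T_from_omega(A, C, omega), T_symmetric_top(A, C, beta, *rates))
```

The two printed values coincide, since $\omega_x^2 + \omega_y^2 = \dot\alpha^2\sin^2\beta + \dot\beta^2$ for any γ.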

9.6. Absolute and Relative Velocity.—We now return to a more
general consideration of the motion of a rigid body. Suppose a point P in
it is located, relative to OXYZ by the vector $\mathbf{r}_0$ and relative to
O'X'Y'Z' by the vector $\mathbf{r}$. Let the instantaneous position of the
origin of O'X'Y'Z' be measured relative to OXYZ by $\mathbf{r}'$, where the
prime here and in the remainder of this chapter never means differentiation.
Then the absolute position of P is given by

$$ \mathbf{r}_0 = \mathbf{r}' + \mathbf{r} \tag{9-26} $$

and its absolute velocity by

$$ \mathbf{v}_0 = \mathbf{v}' + \mathbf{v}'' \tag{9-27} $$

where $\mathbf{v}' = d\mathbf{r}'/dt$ measures the velocity of the origin of
O'X'Y'Z' relative to OXYZ and $\mathbf{v}''$ is the velocity of the point in the
moving system. Now suppose that the latter system is rotating with constant
angular velocity of $\boldsymbol{\omega}$ radians per second; then the point P has
a linear velocity $\boldsymbol{\omega} \times \mathbf{r}$ in addition to its
translational velocity $\mathbf{v}$ relative to O'X'Y'Z'. Its components are
$v_x = dr_x/dt = \dot{x}$; $v_y = \dot{y}$; $v_z = \dot{z}$. Thus,

$$ \mathbf{v}_0 = \mathbf{v}' + \boldsymbol{\omega} \times \mathbf{r} + \mathbf{v} \tag{9-28} $$

It is important to have a clear understanding of the separate terms in
(28). The absolute velocity of the point is $\mathbf{v}_0$; $\mathbf{v}$ is the
apparent velocity of P measured by an observer in the system O'X'Y'Z' who does
not know that his coordinate axes are rotating, while
$\boldsymbol{\omega} \times \mathbf{r}$ is the absolute velocity which the
terminus of $\mathbf{r}$ must have in order to maintain its position in the
moving body. The last velocity is often called the velocity of following.
If the point P is rigidly attached to the moving system, $\mathbf{v} = 0$; if
the moving system and the fixed system have coincident origins,
$\mathbf{v}' = 0$.

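9.7. Motion of a Molecule.—In a molecule, we may consider the
electrons and nuclei as bound together in a rigid framework which moves
through space in translational motion and which rotates around its center
of gravity. Both of these types of motion are included in the equations
already given. One further motion is needed, however, for the nuclei
execute oscillations around an equilibrium position. In order to allow for
this vibrational motion, let $\mathbf{r}_i$ be the instantaneous position vector
of the i-th particle and $\mathbf{a}_i$, $\boldsymbol{\rho}_i$ be the equilibrium
and displacement vectors, respectively, so that

$$ \mathbf{r}_i = \mathbf{a}_i + \boldsymbol{\rho}_i \tag{9-29} $$

while

$$ \mathbf{r}_{0i} = \mathbf{r}' + \mathbf{r}_i \tag{9-30} $$

is the instantaneous position of the point relative to OXYZ as shown in
Fig. 3. Then from (28)

$$ \mathbf{v}_{0i} = \mathbf{v}' + (\boldsymbol{\omega} \times \mathbf{r}_i) + \mathbf{v}_i \tag{9-31} $$

and

$$ \begin{aligned} 2T = \sum m_i v_{0i}^2 = {} & v'^2 \sum m_i + \sum m_i v_i^2 + \sum m_i (\boldsymbol{\omega} \times \mathbf{r}_i)\cdot(\boldsymbol{\omega} \times \mathbf{r}_i) \\ & + 2\mathbf{v}'\cdot\sum m_i \mathbf{v}_i + 2\sum m_i \mathbf{r}_i\cdot(\mathbf{v}' \times \boldsymbol{\omega}) + 2\boldsymbol{\omega}\cdot\sum m_i (\mathbf{r}_i \times \mathbf{v}_i) \end{aligned} \tag{9-32} $$

The reason for writing the last two terms of (32) as given comes from
eq. (4-18) since

$$ \mathbf{v}'\cdot(\boldsymbol{\omega} \times \mathbf{r}_i) = \mathbf{r}_i\cdot(\mathbf{v}' \times \boldsymbol{\omega}); \qquad \mathbf{v}_i\cdot(\boldsymbol{\omega} \times \mathbf{r}_i) = \boldsymbol{\omega}\cdot(\mathbf{r}_i \times \mathbf{v}_i) $$

Fig. 3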


Six further relations are needed to define the rotating coordinate system.
These⁶ may conveniently be taken as

$$ \sum m_i \mathbf{r}_i = 0 \tag{9-33} $$

$$ \sum m_i\, \mathbf{a}_i \times \mathbf{v}_i = 0 \tag{9-34} $$

The first three of these equations locate the origin of O'X'Y'Z' at the
center of gravity of the system, for that point is given by

$$ \bar{\mathbf{r}} = \frac{\sum m_i \mathbf{r}_i}{\sum m_i} $$

and if $\bar{\mathbf{r}} = 0$, $\sum m_i \mathbf{r}_i = 0$ and
$\sum m_i \mathbf{v}_i = 0$. The second condition, eq. (34),
states that there is no angular momentum relative to O'X'Y'Z', when all
particles occupy their equilibrium positions, i.e., when every
$\mathbf{r}_i = \mathbf{a}_i$.

Using (29), (33), and (34), eq. (32) becomes

$$ \begin{aligned} 2T &= v'^2 \sum m_i + \sum m_i v_i^2 + \sum m_i (\boldsymbol{\omega} \times \mathbf{r}_i)\cdot(\boldsymbol{\omega} \times \mathbf{r}_i) + 2\boldsymbol{\omega}\cdot\sum m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i) \\ &= 2(T_t + T_v + T_r + T_{int}) \end{aligned} \tag{9-35} $$

Inspection of (35) shows that the kinetic energy is a sum of four terms which
may be interpreted in order as due to the translational motion of the
molecule as a whole through space ($T_t$); the vibrational motion of the nuclei
about an equilibrium position ($T_v$); the rotation of the molecule as a rigid
body about its center of gravity ($T_r$); interaction between vibration and
rotation ($T_{int}$).

6 See Eckart, Phys. Rev. 47, 552 (1935); Sayvetz, J. Chem. Phys. 7, 383 (1939).

9.8. The Kinetic Energy of a Molecule.—It is necessary to obtain (35)
in explicit form before further calculations can be made. As shown
previously $T_r$ becomes equal to (23), but it must be remembered that A, B,
···, F are instantaneous moments and products of inertia relative to the
moving axes. They are not constants but functions of the position of the
atoms and they change as the molecule vibrates.

In discussing the terms $T_v$ and $T_{int}$ it is convenient to use normal
coordinates (see sec. 10.17). Suppose $\boldsymbol{\rho}_i$ has components
$\xi_i/\sqrt{m_i}$, $\eta_i/\sqrt{m_i}$, $\zeta_i/\sqrt{m_i}$ where

$$ \begin{aligned} \xi_i &= \sum_k l_{ik} Q_k \\ \eta_i &= \sum_k m_{ik} Q_k \\ \zeta_i &= \sum_k n_{ik} Q_k \end{aligned} \tag{9-36} $$

and $l_{ik}$, $m_{ik}$, $n_{ik}$ are constant coefficients such that


L ` — 81 . y — l 
E leilrg = 3657; Lamm; = 3623; Denny; = 30,7 
k k 


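Then,

$$ \sum m_i v_i^2 = \sum_i (\dot\xi_i^2 + \dot\eta_i^2 + \dot\zeta_i^2) = \sum_k \dot{Q}_k^2 \tag{9-37} $$

Moreover,

$$ \begin{aligned} \sum m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_x &= \sum_i (\eta_i\dot\zeta_i - \zeta_i\dot\eta_i) = \sum_k X_k \dot{Q}_k \\ \sum m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_y &= \sum_i (\zeta_i\dot\xi_i - \xi_i\dot\zeta_i) = \sum_k Y_k \dot{Q}_k \\ \sum m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_z &= \sum_i (\xi_i\dot\eta_i - \eta_i\dot\xi_i) = \sum_k Z_k \dot{Q}_k \end{aligned} \tag{9-38} $$

where

$$ \begin{aligned} X_k &= \sum_i \sum_l (n_{ik} m_{il} - m_{ik} n_{il})\, Q_l \\ Y_k &= \sum_i \sum_l (l_{ik} n_{il} - n_{ik} l_{il})\, Q_l \\ Z_k &= \sum_i \sum_l (m_{ik} l_{il} - l_{ik} m_{il})\, Q_l \end{aligned} \tag{9-39} $$

Collecting terms, (35) finally appears as

$$ \begin{aligned} 2T_t &= v'^2 \sum m_i \\ 2T_v &= \sum_k \dot{Q}_k^2 \\ 2T_r &= A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_x\omega_y - 2E\omega_y\omega_z - 2F\omega_z\omega_x \\ 2T_{int} &= 2\omega_x \sum_k X_k \dot{Q}_k + 2\omega_y \sum_k Y_k \dot{Q}_k + 2\omega_z \sum_k Z_k \dot{Q}_k \end{aligned} \tag{9-40} $$

Here and in the following section D, E, and F multiply $\omega_x\omega_y$,
$\omega_y\omega_z$, and $\omega_z\omega_x$, respectively, as eq. (41) requires;
this labeling of the products of inertia differs from that of eq. (21).
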
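9.9. The Hamiltonian Form of the Kinetic Energy.—In order to obtain
the Hamiltonian form of (40), we must change from angular velocities to
angular momenta. From (17) or (22) we see that the components of
total angular momentum are

$$ \begin{aligned} P_x &= \frac{\partial T}{\partial \omega_x} = A\omega_x - D\omega_y - F\omega_z + \sum_k X_k \dot{Q}_k \\ P_y &= \frac{\partial T}{\partial \omega_y} = -D\omega_x + B\omega_y - E\omega_z + \sum_k Y_k \dot{Q}_k \\ P_z &= \frac{\partial T}{\partial \omega_z} = -F\omega_x - E\omega_y + C\omega_z + \sum_k Z_k \dot{Q}_k \end{aligned} \tag{9-41} $$

Similarly, the momenta conjugate to $Q_k$ are

$$ p_k = \frac{\partial T}{\partial \dot{Q}_k} = \dot{Q}_k + X_k\omega_x + Y_k\omega_y + Z_k\omega_z \tag{9-42} $$

Solving this equation for $\dot{Q}_k$ and substituting in (41) gives

$$ P_x = A\omega_x - D\omega_y - F\omega_z + \sum_k X_k (p_k - X_k\omega_x - Y_k\omega_y - Z_k\omega_z) \tag{9-43} $$

with similar expressions for $P_y$ and $P_z$. The following abbreviations may
be used to simplify the final results.

$$ \begin{aligned} A' &= A - \sum_k X_k^2; & D' &= D + \sum_k X_k Y_k \\ B' &= B - \sum_k Y_k^2; & E' &= E + \sum_k Y_k Z_k \\ C' &= C - \sum_k Z_k^2; & F' &= F + \sum_k Z_k X_k \end{aligned} \tag{9-44} $$

In terms of them, we may write

$$ \begin{aligned} P_x &= A'\omega_x - D'\omega_y - F'\omega_z + \sum_k X_k p_k \\ P_y &= -D'\omega_x + B'\omega_y - E'\omega_z + \sum_k Y_k p_k \\ P_z &= -F'\omega_x - E'\omega_y + C'\omega_z + \sum_k Z_k p_k \end{aligned} \tag{9-43a} $$

If we also write

$$ p_x = \sum_k X_k p_k; \qquad p_y = \sum_k Y_k p_k; \qquad p_z = \sum_k Z_k p_k \tag{9-45} $$

(43a) may be further simplified to read

$$ \begin{aligned} P_x &= p_x + A'\omega_x - D'\omega_y - F'\omega_z \\ P_y &= p_y - D'\omega_x + B'\omega_y - E'\omega_z \\ P_z &= p_z - F'\omega_x - E'\omega_y + C'\omega_z \end{aligned} \tag{9-46} $$

The quantities $p_x$, $p_y$, $p_z$ arise from vibration alone as may be seen from
their definition, eq. (45); they are called components of internal angular
momentum.

Adding together all the terms of (40) and using (41), (42) and (45),
we obtain

$$ 2T = 2T_t + (P_x - p_x)\omega_x + (P_y - p_y)\omega_y + (P_z - p_z)\omega_z + \sum_k p_k^2 \tag{9-47} $$

Finally, we find by solving (46) for the ω's that

$$ \omega_i = \sum_j \mu_{ij} \mathcal{P}_j; \quad (i, j = x, y, z); \qquad \mu_{ij} = \mu_{ji}; \qquad \mathcal{P}_j = (P_j - p_j) \tag{9-48} $$

With the use of these variables, eq. (47) takes the more elegant form

$$ 2T = 2T_t + \sum_{i,j} \mu_{ij} \mathcal{P}_i \mathcal{P}_j + \sum_k p_k^2 \tag{9-49} $$

Explicitly the μ's are:

$$ \begin{aligned} \mu_{xx} &= \frac{B'C' - E'^2}{\Delta}; & \mu_{yy} &= \frac{A'C' - F'^2}{\Delta} \\ \mu_{zz} &= \frac{A'B' - D'^2}{\Delta}; & \mu_{xy} &= \frac{C'D' + E'F'}{\Delta} \\ \mu_{xz} &= \frac{D'E' + B'F'}{\Delta}; & \mu_{yz} &= \frac{A'E' + D'F'}{\Delta} \end{aligned} $$

$$ \Delta = \begin{vmatrix} A' & -D' & -F' \\ -D' & B' & -E' \\ -F' & -E' & C' \end{vmatrix} \tag{9-50} $$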

9.10. The Vibrational Energy of a Molecule.” —The first term in (47), 
the translational energy, is of little interest in physical problems. We 
shall have no more to say about it. The only other term of that equation 
which can be treated further by classical mechanics is the last one, corre- 
sponding to the vibrational energy of the molecule. We first consider the 
potential energy of the system due to the vibration of the particles. It will 
be some function of the mutual positions of the nuclei and it is most con- 
venient to specify these in terms of the mass-adjusted components of a 
displacement vector. We formerly took these as £,/ Vmi, n/V mi ta/ V mi 
(see eq. 36), 3n in number. Following convention we now use q1, q3 °°; 
qzn for the same coordinates. If the system is placed originally in the 
equilibrium configuration (all $q_i = 0$) and if the particles have very small
initial velocities, we assume that they will never depart to any large distance
from that configuration, nor will they ever acquire large velocities.
Under these conditions, we may develop the potential energy V by Taylor's
theorem in terms of ascending powers of the $q_i$:

$V(q_1, \cdots, q_{3n}) = V_0 + \sum_i \left(\frac{\partial V}{\partial q_i}\right)_0 q_i + \frac{1}{2}\sum_{i,j} \left(\frac{\partial^2 V}{\partial q_i\,\partial q_j}\right)_0 q_i q_j + \cdots$ (9-51)

7 This section as well as secs. 9.11 and 9.12 makes use of some of the results of
Chapter 10. It should be omitted or postponed by readers not familiar with orthogonal
transformations. The authors suggest that the reader, rather than endeavor to understand
normal coordinates by "elementary considerations," acquaint himself with the
more powerful methods of the next chapter.

The constant term $V_0$, which is independent of the $q_i$, can be omitted since
it has no effect on the equations of motion of the system. The term linear
in the $q_i$ must also vanish since $(\partial V/\partial q_i)_0 = 0$ is the condition for equilibrium.
Finally if we omit all terms beyond the third, we obtain as an approximation
to the vibrational potential energy

$2V = \sum_{i,j} b_{ij} q_i q_j$ (9-52)

where $b_{ij} = (\partial^2 V/\partial q_i\,\partial q_j)_0$. From (37) we have, in terms of the
coordinates $q_i$,
$2T = \sum_i \dot{q}_i^2$ (9-53)

where T is now written for the former $T_v$.
If we now subject both T and V to an orthogonal transformation (see
sec. 10.17), we obtain

$2T = \sum_k \dot{Q}_k^2; \quad 2V = \sum_k \lambda_k Q_k^2$ (9-54)
where the normal coordinates $Q_k$ are related to the q's by
$q_i = \sum_k a_{ik} Q_k$ (9-55)

The constants $\lambda_k$ are the 3n eigenvalues found from the characteristic
equation

$|\lambda\delta_{ij} - b_{ij}| = 0$ (9-56)

and $[a_{ik}]$ is the matrix formed from the eigenvectors.
Knowing T and V we may obtain the motion of the molecule by solving
Lagrange's equations (8). They appear as

$\frac{d}{dt}\frac{\partial T}{\partial \dot{Q}_k} + \frac{\partial V}{\partial Q_k} = 0; \quad (k = 1, 2, \cdots, 3n)$

or
$\ddot{Q}_k + \lambda_k Q_k = 0$ (9-57)

Three different possibilities arise: (a) $\lambda_k > 0$; (b) $\lambda_k = 0$; (c) $\lambda_k < 0$.
a. $\lambda_k > 0$. The solutions of (57) are

$Q_k = A_k \cos(\sqrt{\lambda_k}\,t + \delta_k); \quad (k = 1, 2, \cdots, 3n)$ (9-58)
This is the equation of simple harmonic motion with two constants of integration;
$A_k$ is the amplitude and $\delta_k$ the phase constant. Eq. (55) now
reads

$q_i = \sum_k a_{ik} A_k \cos(\sqrt{\lambda_k}\,t + \delta_k)$ (9-59)

If all of the $A_k$ are zero except one, say $A_l$, then all of the nuclei are acting
as simple harmonic oscillators with a frequency of

$\nu_l = \sqrt{\lambda_l}/2\pi$

about their equilibrium position. Each nucleus has the same phase
constant and reaches its equilibrium position at the same time. The amplitudes
will vary because of the factor $a_{il}$. Such a motion is called a normal
mode of vibration. Actually the situation is much more complex, for many
of the $A_k$ will be different from zero. Thus the motion of the nuclei consists
of a superposition of all the normal modes of vibration, each with its
own frequency $\sqrt{\lambda_k}/2\pi$ and amplitude.

It frequently happens that some of the $\lambda_k$ will be equal to each other
in pairs or threes. This phenomenon, called double or triple degeneracy,⁸
means that two or three equivalent motions of the molecule have the same
frequency and differ only with respect to their orientation in space. The
phase factors and amplitudes must be evaluated from the initial positions
and velocities of the n nuclei. We show in the next section how the normal
modes and coordinates may be determined for a specific example.
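The superposition expressed by eq. (59) is easily evaluated numerically. The sketch below is only an illustration: the function names, the two-coordinate system and all numerical values are hypothetical and are not taken from the text. It excites a single normal mode and confirms that every coordinate then returns to its starting value after one period $2\pi/\sqrt{\lambda_l}$.

```python
import math


def displacements(a, lam, amp, phase, t):
    """Evaluate q_i(t) of eq. (9-59) for every coordinate i.

    a[i][k]          : transformation coefficients a_ik
    lam[k]           : eigenvalues lambda_k (assumed positive here)
    amp[k], phase[k] : the constants of integration A_k and delta_k
    """
    return [sum(a[i][k] * amp[k] * math.cos(math.sqrt(lam[k]) * t + phase[k])
                for k in range(len(lam)))
            for i in range(len(a))]


def frequency(lam_k):
    """Frequency sqrt(lambda_k) / (2 pi) of a single normal mode."""
    return math.sqrt(lam_k) / (2.0 * math.pi)


# Hypothetical system with two coordinates and an orthogonal matrix a.
s = 1.0 / math.sqrt(2.0)
a = [[s, s], [s, -s]]
lam = [4.0, 9.0]

# Only the second mode is excited, so both coordinates share one frequency.
q_start = displacements(a, lam, amp=[0.0, 0.5], phase=[0.0, 0.0], t=0.0)
period = 1.0 / frequency(lam[1])
q_later = displacements(a, lam, amp=[0.0, 0.5], phase=[0.0, 0.0], t=period)
```

With several non-zero amplitudes the same function gives the general motion, which is periodic only when the frequencies are commensurable.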

b. $\lambda_k = 0$. The solution of (57) is

$Q_k = A_k t + \delta_k$

hence the resulting motion is not a vibration. The nuclei will not oscillate
about the equilibrium position but will continually move away. Since the
whole treatment of the problem is based upon small oscillations from the
equilibrium position, we are no longer justified in this case in omitting
higher terms in the potential energy, and the method fails. Actually, it
will be found that six of the $\lambda_k$ vanish in the molecular problem (five if the
equilibrium arrangement of the nuclei is linear). Three of these zero
frequencies may be associated with translation of the molecule along three
mutually perpendicular axes and the remaining three with rotation about
the same axes. When it is desired, the zero frequencies may be removed
from the problem before solving (56). This is done by reducing the
number of coordinates from 3n to 3n − 6, the equations of conservation
of linear and angular momentum (eqs. 33 and 34) being used for that
purpose.

c. $\lambda_k < 0$. The frequency $\sqrt{\lambda_k}/2\pi$ becomes imaginary and the solution
again does not correspond to a vibration. This case never occurs if the potential energy is a
positive definite quadratic form (see sec. 10.12) which is always true in the
molecular problem.

8 Wu, Ta-You, "Vibrational Spectra and Structure of Polyatomic Molecules,"
Second Edition, Edwards Brothers, Inc., Ann Arbor, 1946; Mathieu, Jean-Paul, "Spectres
de Vibration et Symétrie des Molécules et des Cristaux," Hermann et Cie, Paris,
1945; Herzberg, G., "Infrared and Raman Spectra of Polyatomic Molecules," D. Van
Nostrand Co., New York, 1945.

9.11. Vibrations of a Linear Triatomic Molecule.—As an example of 
the preceding theory, we consider a linear symmetrical triatomic molecule 
XY₂ such as carbon dioxide. Let the central atom X have a mass $m_2$
and the two end particles have mass $m_1$. Let the equilibrium positions
be $x_1^0$ and $x_3^0$ for Y and $x_2^0$ for X. In order to simplify the problem, we arbitrarily
assume that the only motion which the nuclei can make is along the
line joining them, hence the displaced positions are $x_i = x_i^0 + \delta x_i$. If
we now take the potential energy⁹ as proportional to the square of the relative
displacements of the particles, in accordance with eq. (51) we have

$2V = k\{(\delta x_2 - \delta x_1)^2 + (\delta x_3 - \delta x_2)^2\}$ (9-60)

and
$2T = m_1(\delta\dot{x}_1^2 + \delta\dot{x}_3^2) + m_2\,\delta\dot{x}_2^2$

In terms of mass adjusted coordinates $q_i = \sqrt{m_i}\,\delta x_i$ (the mass of the
third particle being again $m_1$)

$2V = k\left[\left(\frac{q_2}{\sqrt{m_2}} - \frac{q_1}{\sqrt{m_1}}\right)^2 + \left(\frac{q_3}{\sqrt{m_1}} - \frac{q_2}{\sqrt{m_2}}\right)^2\right]$ (9-61)

Comparison with (52) shows us that

$b_{11} = k/m_1; \quad b_{12} = b_{21} = -k/\sqrt{m_1 m_2}; \quad b_{13} = b_{31} = 0$
$b_{22} = 2k/m_2; \quad b_{23} = b_{32} = -k/\sqrt{m_1 m_2}; \quad b_{33} = k/m_1$

When these values are substituted in (56) and the determinantal equation
is solved we obtain

$\lambda_1 = k/m_1; \quad \lambda_2 = k\mu; \quad \lambda_3 = 0$
$\mu = (2m_1 + m_2)/m_1 m_2$ (9-62)
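The roots quoted in (62) may be checked by substituting them back into the secular determinant (56). The following sketch does this for one arbitrary choice of force constant and masses. The numerical values and the function names are assumptions made here for illustration; the $b_{ij}$ are those read off from (61).

```python
import math


def b_matrix(k, m1, m2):
    """Coefficients b_ij of the linear XY2 molecule, from eq. (9-61)."""
    c = -k / math.sqrt(m1 * m2)
    return [[k / m1, c, 0.0],
            [c, 2.0 * k / m2, c],
            [0.0, c, k / m1]]


def det3(m):
    """Determinant of a 3 x 3 array by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def secular(lam, b):
    """Left-hand side of eq. (9-56): | lambda delta_ij - b_ij |."""
    return det3([[(lam if i == j else 0.0) - b[i][j] for j in range(3)]
                 for i in range(3)])


def eigenvalues(k, m1, m2):
    """The three roots quoted in eq. (9-62)."""
    mu = (2.0 * m1 + m2) / (m1 * m2)
    return [k / m1, k * mu, 0.0]


# Hypothetical force constant and masses in arbitrary units.
k, m1, m2 = 3.0, 16.0, 12.0
residuals = [secular(lam, b_matrix(k, m1, m2)) for lam in eigenvalues(k, m1, m2)]
```

Each entry of `residuals` should vanish to within rounding error, whereas a value of $\lambda$ that is not a root gives a determinant of ordinary size.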
In order to find the coefficients $a_{ik}$ of eq. (55) which relate the $q_i$ to the
normal coordinates $Q_k$ it is necessary to find the transformation which
reduces T and V simultaneously to a sum of squares (see sec. 10.17).
According to sec. 10.15 the matrix effecting this transformation has as its
columns the eigenvectors of the matrix $[b_{ij}]$, and these eigenvectors are the
solutions $(x_1, x_2, x_3)$ of the equations
$\sum_j b_{ij} x_j = \lambda x_i$

corresponding to the three eigenvalues $\lambda_k$ already found. Simple computation
yields for these eigenvectors

$[-x_3,\ 0,\ x_3]; \quad \lambda = k/m_1$
$[x_3,\ -2\sqrt{m_1/m_2}\,x_3,\ x_3]; \quad \lambda = k\mu$
$[x_3,\ \sqrt{m_2/m_1}\,x_3,\ x_3]; \quad \lambda = 0$

9 See Herzberg, loc. cit., for remarks concerning the choice of the potential energy
expression in special cases.


They are already orthogonal; when $x_3$ is also fixed by normalization, i.e.,
by equating the sum of the squares of the components of each vector to
unity, they may be compounded to give
$[a_{ik}] = \begin{bmatrix} -1/\sqrt{2} & 1/\sqrt{2\mu m_1} & 1/\sqrt{\mu m_2} \\ 0 & -2/\sqrt{2\mu m_2} & 1/\sqrt{\mu m_1} \\ 1/\sqrt{2} & 1/\sqrt{2\mu m_1} & 1/\sqrt{\mu m_2} \end{bmatrix}$ (9-63)
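A matrix of the form (63) ought to satisfy two conditions: its columns must be orthonormal, and it must bring $[b_{ij}]$ to diagonal form with the eigenvalues of (62) on the diagonal. The sketch below tests both conditions for one hypothetical set of constants. The helper names and numerical values are illustrative assumptions, not part of the text.

```python
import math


def b_matrix(k, m1, m2):
    """Coefficients b_ij of the linear XY2 molecule, from eq. (9-61)."""
    c = -k / math.sqrt(m1 * m2)
    return [[k / m1, c, 0.0],
            [c, 2.0 * k / m2, c],
            [0.0, c, k / m1]]


def a_matrix(m1, m2):
    """Normalized eigenvectors arranged as the columns of eq. (9-63)."""
    mu = (2.0 * m1 + m2) / (m1 * m2)
    r = math.sqrt
    return [[-1 / r(2), 1 / r(2 * mu * m1), 1 / r(mu * m2)],
            [0.0, -2 / r(2 * mu * m2), 1 / r(mu * m1)],
            [1 / r(2), 1 / r(2 * mu * m1), 1 / r(mu * m2)]]


def matmul(x, y):
    """Ordinary row-into-column product of two arrays."""
    return [[sum(x[i][s] * y[s][j] for s in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x):
    return [list(col) for col in zip(*x)]


# Hypothetical force constant and masses in arbitrary units.
k, m1, m2 = 3.0, 16.0, 12.0
a = a_matrix(m1, m2)
should_be_unit = matmul(transpose(a), a)
should_be_diagonal = matmul(transpose(a), matmul(b_matrix(k, m1, m2), a))
```

The diagonal of `should_be_diagonal` reproduces $k/m_1$, $k\mu$ and 0 in that order, the order in which the columns were written.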


We can now find the normal modes of vibration from (59). Taking
$\lambda_k = \lambda_1$, we see that the two end atoms move in opposite directions while
the central atom is stationary. The other normal modes are found in the
same way. They are shown in Fig. 4. It will be observed that for the
zero frequency,¹⁰ $\lambda_3$, the motion is translational, since $\delta x_1 = m_1^{-1/2} q_1 =
m_1^{-1/2} a_{13}(A_3 t + \delta_3) = (2m_1 + m_2)^{-1/2}(A_3 t + \delta_3)$, and $\delta x_2$, $\delta x_3$ also equal
this expression.

[Fig. 9-4. Normal modes of the linear XY₂ molecule belonging to $\lambda_1$, $\lambda_2$ and $\lambda_3$.]

10 This frequency could have been removed from the problem by applying the condition
$m_1(\delta x_1 + \delta x_3) + m_2\,\delta x_2 = 0$ to (60).

The treatment of this molecule is not complete because of the artificial
assumption that the motion is only along the line of nuclei. For a complete
treatment the reader is referred to Wu, Mathieu, or Herzberg
(loc. cit.).

9.12. Quantum Mechanical Hamiltonian.—Lack of space forbids the 
transcription of the results thus far obtained into the quantum mechanical 
language of Chapter 11. To provide a general view, however, we shall 
append here a few comments indicating the line of attack to be taken on the 
problem of the polyatomic molecule from the quantum point of view. The 
material of this section is not needed in other parts of this book. The 
expression for the classical kinetic energy found in (49) contains momenta 
$p_a$ ($a = x, y, z$) defined in eq. (45) and $p_k$ defined in eq. (42). Both of
these are conjugate to the normal coordinates $Q_k$. On the other hand the
momenta $P_a$ of (43) are not conjugate to $Q_k$. In order to obtain a suitable
expression for use in quantum mechanical calculations, all of the coordinates
and momenta must be conjugate to each other. It is true that the
$P_a$, which are functions of the angular velocities, could be written in terms
of some set of coordinates such as the Eulerian angles, and then the Eulerian
angles $\alpha$, $\beta$, $\gamma$, the normal coordinates and the conjugate momenta $P_\alpha$, $P_\beta$, $P_\gamma$,
$p_a$ and $p_k$ would be appropriate. The coordinates used in (49) may be
retained, however, as shown by several authors. The correct quantum
mechanical Hamiltonian¹¹ is

$H = \tfrac{1}{2}\mu^{1/4}\sum_{a,b}(P_a - p_a)\mu_{ab}\mu^{-1/2}(P_b - p_b)\mu^{1/4}$
$\quad + \tfrac{1}{2}\mu^{1/4}\sum_k p_k\mu^{-1/2}p_k\mu^{1/4} + V$ (9-64)

where a, b denote x, y, z and $\mu$ is the determinant of $\mu_{ab}$ (cf. eq. 50). This
expression may be simplified by noting that $P_a$ commutes with $p_b$ and that
the $\mu_{ab}$ are functions only of the $Q_k$. We thus obtain


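$H = \tfrac{1}{2}\sum_{a,b}\mu_{ab}P_aP_b - \sum_a h_aP_a + \tfrac{1}{2}\mu^{1/4}\sum_{a,b}p_a\mu_{ab}\mu^{-1/2}p_b\mu^{1/4}$

$\quad + \tfrac{1}{2}\mu^{1/4}\sum_k p_k\mu^{-1/2}p_k\mu^{1/4} + V$ (9-65)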
where 
ha = TE 2HabPs + Pettus + Habt? (pou) (9-66) 


and $p_b$ does not commute with the $\mu$'s.

For the sake of greater generality, we no longer need confine ourselves
to the potential energy expression previously used but write the most
general function consistent with the symmetry of the molecule

$V = V_0 + V_1 + V_2 + \cdots$

The first term, $V_0$, is identical with that given in (52) or (54); $V_1$ is homogeneous
in the third powers of the normal coordinates and their cross-products;
$V_2$ is of the fourth power, etc. When the Hamiltonian is
expanded, it is found that it can be divided into terms of different orders
as follows:

$H = H_0 + H_1 + H_2 + \cdots$ (9-67)
The explicit form of $H_0$ is
$2H_0 = \frac{P_x^2}{A_0} + \frac{P_y^2}{B_0} + \frac{P_z^2}{C_0} + \sum_k p_k^2 + 2V_0$ (9-68)

where $A_0$, $B_0$, $C_0$ are the equilibrium moments of inertia. It is seen that
this represents the sum of the Hamiltonians of a rigid rotator and a harmonic
oscillator; hence this part of H may be treated exactly by the
methods of quantum mechanics as outlined in Chapter 11.

11 See Wilson, E. B. and Howard, J. B., J. Chem. Phys. 4, 260 (1936) or Dennison,
D. M., and Darling, B. T., Phys. Rev. 57, 128 (1940).


Even to this order of approximation the details are tedious, for the
vibrational part of the Hamiltonian for an n-atomic molecule involves the
solution of a secular determinant like (56) with (3n − 6) rows and columns.
Utilization of molecular symmetry, however, makes it possible to factor
this determinant.¹² Moreover, if the potential energy term, $V_0$, is expressed
in coordinates parallel or perpendicular to chemical bonds (so-called
valence-bond coordinates), then vector and matrix methods, developed by
Wilson and others,¹³ prove to be powerful tools for even quite complex
molecules.

Still further difficulties arise if higher terms in the Hamiltonian are
included, but these are important if interactions between the rotational and
vibrational energies are considered. Such interactions are often detectable
experimentally and higher order rotational effects are also observed,
especially in the microwave region.¹⁴ A suitable perturbation technique
for such cases, which involves contact transformations, has been developed
by Nielsen¹⁵ for the general n-atomic molecule and has been applied to
many special molecules.


12 See Chapter 15, or Herzberg, loc. cit.

13 Wilson, E. B., Jr., Decius, J. C., and Cross, P. C., "Molecular Vibrations. The
Theory of Infrared and Raman Vibrational Spectra," McGraw-Hill Book Co., Inc.,
New York, 1955.

14 Gordy, W., Smith, W. V., and Trambarulo, R. F., "Microwave Spectroscopy,"
John Wiley and Sons, New York, 1953.

15 Nielsen, H. H., Revs. Mod. Phys. 23, 90 (1951).


CHAPTER 10 
MATRICES AND MATRIX ALGEBRA 


In ordinary arithmetic, attention is focused upon single numbers. 
These numbers may be combined by various operations, such as addition, 
subtraction, multiplication and so on, to yield new numbers. In many 
branches of algebra, the student is forced to confer interest, not upon 
single numbers, but on collections of numbers (or functions). These collections
can be simple sequences like $a_1, a_2, \cdots, a_n$, in which the order of
the individuals may, or may not be of importance. A vector is an example
of this kind. When such a sequence is written down, no understanding
prevails that the numbers are to be combined in a certain way; it is the
collection itself which matters. Meaning is imparted to the collection by
specifying how it is to be combined with other collections.

Besides simple sequences, collections of two-dimensional character are
often objects of interest in mathematics, and recently in physics and
chemistry. They may have a great variety of forms; they may be triangular, as

$\begin{matrix} a_1 & & \\ b_1 & b_2 & \\ c_1 & c_2 & c_3 \end{matrix}$

or rectangular, as

$\begin{matrix} a_1 & a_2 & a_3 & \cdots & a_n \\ b_1 & b_2 & b_3 & \cdots & b_n \\ c_1 & c_2 & c_3 & \cdots & c_n \end{matrix}$

or quadratic, as

$\begin{matrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{matrix}$

Of these, the rectangular and quadratic ones are of greatest value.
Without further specification, they are simply arrays, devoid of meaning.
But when rules are laid down, stating how they may be combined to form
new arrays, they become objects of mathematical importance, such as
determinants and matrices. This is usually indicated by enclosing the array
in bars or brackets of different form, bars being frequently used for determinants,
brackets for matrices. It is also convenient to use a single letter
for the individuals of a collection, and to distinguish the individuals of a
simple or linear collection by single subscripts, those of a two-dimensional
collection by two subscripts.

10.1. Arrays.—A collection of real or complex quantities is called an 
array if it can be displayed in an orderly table of rows and columns. The 
individual members of the array are its elements. Each is equipped with a 
pair of indices, the first one referring to the row and the second one to the 
column in which the element is located. For example, the element $A_{pq}$
will appear in the p-th row and the q-th column. If the number of rows n
equals the number of columns, the array is said to be square (or quadratic)
and of order n; if there are n rows and m columns ($n \neq m$), the array is
rectangular and of order (n × m).

10.2. Determinants.—The most familiar type of array is the determinant,¹
which always has an equal number of rows and columns. It will be
written in one of the forms:

$\det A = |A| = \begin{vmatrix} A_{11} & A_{12} & A_{13} & \cdots & A_{1n} \\ A_{21} & A_{22} & A_{23} & \cdots & A_{2n} \\ A_{31} & A_{32} & A_{33} & \cdots & A_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & A_{n3} & \cdots & A_{nn} \end{vmatrix}$

1 References will be found at the end of this chapter. The most complete accounts
of determinants are those of Muir, T., "Theory of Determinants in the Historical Order
of Development," 4 vols., 1906-1923, and "Contributions to the History of Determinants,
1900-1920," Blackie and Son, Ltd., London, 1930.

The value of the determinant is obtained by the following procedure.
First, a total of n! products is formed by taking one element from each row
and column. Each product is then arranged so that the first subscripts
of the elements are in their natural order 1, 2, ···, n. When this has
been done, it will be found that the products may be separated into even
and odd classes each containing n!/2 terms, as follows. In the even class,
an even number of interchanges of the elements is required to bring the
second subscripts into their natural order while in the odd class, an odd
number of interchanges is needed. For example, $A_{12}A_{23}A_{31}$ is in the even
class while $A_{12}A_{21}A_{33}$ is in the odd class. If a plus sign is affixed to the
even products and a minus sign to the odd ones, the algebraic sum of the
n! terms, by definition, is the value of the determinant. We may thus
write

$|A| = \sum (-1)^h A_{1r_1}A_{2r_2}\cdots A_{nr_n}$ (10-1)

where the summation is made over all permutations of $r_1, r_2, \cdots, r_n$, and h
is the number of interchanges required to restore the natural order.
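The definition (1) can be transcribed almost literally into a short program. The following is a minimal sketch with hypothetical function names, in which subscripts are counted from 0 instead of 1. It is meant only to make the counting of interchanges concrete; with n! terms it is not a practical method of evaluation for large n.

```python
from itertools import permutations


def interchanges(perm):
    """Count interchanges that restore the natural order 0, 1, ..., n-1.

    Only the parity of the count matters for the sign in eq. (10-1).
    """
    p = list(perm)
    h = 0
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # put the value j into position j
            h += 1
    return h


def det_leibniz(a):
    """Sum of (-1)^h A[0][r0] A[1][r1] ... over all permutations r, eq. (10-1)."""
    n = len(a)
    total = 0
    for r in permutations(range(n)):
        term = (-1) ** interchanges(r)
        for row in range(n):
            term *= a[row][r[row]]
        total += term
    return total


# The two examples of the text, with subscripts shifted to start at 0:
even_example = interchanges((1, 2, 0))   # second subscripts of A12 A23 A31
odd_example = interchanges((1, 0, 2))    # second subscripts of A12 A21 A33
```

For an array of order 4 the 24 permutations divide into 12 even and 12 odd ones, in agreement with the statement that each class contains n!/2 terms.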

The following properties are direct consequences² of this definition. In
each statement, the word row may be replaced by the word column and
the reverse.

1. The value of a determinant vanishes, |A| = 0, when:

a. All elements of a row are zero.
b. All elements of one row are identical with, or multiples of, the
corresponding elements of another row.

2. The value of a determinant is unchanged, if:

a. Rows and columns are interchanged.
b. A linear combination of any number of the other rows is added to any one
row; i.e., if $A_{ij}$ is replaced by $A_{ij} + \sum_{k \neq i} c_k A_{kj}$, $j = 1, 2, \cdots, n$, provided the $c_k$ are
fixed numbers.

3. The value of a determinant changes sign if two rows are interchanged.
4. If each element in any one row appears as the sum (or difference) of
two or more quantities, the determinant may be written as a sum (or difference)
of two or more determinants of the same order. Thus if the order
is two
$\begin{vmatrix} A_{11} + B_{11} & A_{12} + B_{12} \\ A_{21} & A_{22} \end{vmatrix} = \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} + \begin{vmatrix} B_{11} & B_{12} \\ A_{21} & A_{22} \end{vmatrix}$

5. If all elements of a row are multiplied by a constant factor, the value
of the determinant is multiplied by the same factor.

10.3. Minors and Cofactors.—The complementary minor of an element
$A_{pq}$ is the determinant obtained by striking out the row and column
in which $A_{pq}$ appears. The cofactor of $A_{pq}$ is $(-1)^{p+q}$ times its complementary
minor. It will be indicated by $A^{pq}$. It follows from eq. (1) that

$|A| = \sum_{i=1}^{n} A_{ik}A^{ik} = \sum_{i=1}^{n} A_{ki}A^{ki}; \quad (k = 1, 2, \cdots, n)$ (10-2)
However,
$\sum_{i=1}^{n} A_{ik}A^{ij} = \sum_{i=1}^{n} A_{ki}A^{ji} = 0; \quad (j \neq k)$ (10-3)
for comparison with (2) shows that these equations are the expansion of a
determinant whose k-th and j-th columns are identical with the k-th column
of |A|, and according to property 1-b of sec. 10.2, if two columns are
identical, |A| = 0.

2 Details of the proofs may be found in texts on determinants (see references at
end of chapter).

Eq. (2), called the Laplace development, is commonly
used for numerical evaluation of determinants, but if their order is larger
than three or four, the number of terms and the labor involved is so great
that other procedures are to be preferred. We describe one in sec. 13.27.
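Equations (2) and (3) together say that the sum of $A_{ik}A^{ij}$ over i equals |A| when j = k and zero otherwise. The sketch below checks this for one small array of integers chosen arbitrarily. The function names are hypothetical, indices start at 0, and the recursive Laplace development is used only because the order is small.

```python
def minor(a, p, q):
    """Strike out row p and column q (indices start at 0)."""
    return [row[:q] + row[q + 1:] for i, row in enumerate(a) if i != p]


def det(a):
    """Determinant by the Laplace development along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum(a[0][j] * cofactor(a, 0, j) for j in range(len(a)))


def cofactor(a, p, q):
    """(-1)^(p+q) times the complementary minor of a[p][q]."""
    return (-1) ** (p + q) * det(minor(a, p, q))


def column_sum(a, k, j):
    """Sum over i of A_ik A^ij: |A| if j == k by (10-2), zero otherwise by (10-3)."""
    return sum(a[i][k] * cofactor(a, i, j) for i in range(len(a)))


a = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]          # illustrative integers, not from the text
table = [[column_sum(a, k, j) for j in range(3)] for k in range(3)]
```

The array `table` is |A| times the unit array, which is the content of eq. (8) of sec. 10.7 in element form.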

10.4. Multiplication and Differentiation of Determinants.—If | A | 
and | B | are determinants of order n, the product | C | 


$|A||B| = |C|$

is a determinant of the same order. Its elements are given by one of the
four equivalent (though not equal!) expressions

$C_{ij} = \sum_k A_{ik}B_{kj}$ or $\sum_k A_{ik}B_{jk}$ or $\sum_k A_{ki}B_{kj}$ or $\sum_k A_{ki}B_{jk}$ (10-4)

The proof for determinants of order two follows. Using the first form
of (4) we obtain

$|C| = \begin{vmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{vmatrix}$

but according to property 4 of sec. 10.2, the product may also be written

$|C| = \begin{vmatrix} A_{11}B_{11} & A_{11}B_{12} \\ A_{21}B_{11} & A_{21}B_{12} \end{vmatrix} + \begin{vmatrix} A_{11}B_{11} & A_{11}B_{12} \\ A_{22}B_{21} & A_{22}B_{22} \end{vmatrix} + \begin{vmatrix} A_{12}B_{21} & A_{12}B_{22} \\ A_{21}B_{11} & A_{21}B_{12} \end{vmatrix} + \begin{vmatrix} A_{12}B_{21} & A_{12}B_{22} \\ A_{22}B_{21} & A_{22}B_{22} \end{vmatrix}$

The first and last terms of this sum vanish, for if the constant factor $A_{11}A_{21}$
is removed from the first determinant its first row is identical with its
second row. Removal of the constant factor $A_{12}A_{22}$ from the last determinant
likewise leaves it with two identical rows. Constant factors may also be
removed from the remaining determinants but they do not vanish. The
result is

$|C| = A_{11}A_{22}\begin{vmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{vmatrix} + A_{12}A_{21}\begin{vmatrix} B_{21} & B_{22} \\ B_{11} & B_{12} \end{vmatrix}$

Referring to property 3 of sec. 10.2 we see that this becomes

$|C| = (A_{11}A_{22} - A_{12}A_{21})\begin{vmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{vmatrix}$

Finally we note that $(A_{11}A_{22} - A_{12}A_{21})$ is just the Laplace development
of |A| so that we have shown the equivalence of the determinant |C|
with the product |A||B|. The proof with the other forms of (4) is
similar. The method is also clearly applicable to determinants of higher
order.

From (2), the partial derivative of a determinant with respect to an
element equals the cofactor:

$\frac{\partial |A|}{\partial A_{ik}} = A^{ik}$
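The statement that the four forms of (4) are equivalent but not equal can be illustrated numerically: the four arrays of elements differ, yet their determinants all agree with |A||B|. The sketch below uses two arbitrary arrays of order two. The function names and sample values are assumptions made for illustration only.

```python
def det(a):
    """Determinant by expansion along the first row (adequate for small orders)."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for j in range(len(a)):
        sub = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(sub)
    return total


def product_forms(a, b):
    """The four arrays C defined by the four expressions of eq. (10-4)."""
    n = range(len(a))
    return [
        [[sum(a[i][k] * b[k][j] for k in n) for j in n] for i in n],  # rows of A into columns of B
        [[sum(a[i][k] * b[j][k] for k in n) for j in n] for i in n],  # rows into rows
        [[sum(a[k][i] * b[k][j] for k in n) for j in n] for i in n],  # columns into columns
        [[sum(a[k][i] * b[j][k] for k in n) for j in n] for i in n],  # columns into rows
    ]


a = [[1, 2], [3, 4]]     # |A| = -2
b = [[0, 1], [5, 2]]     # |B| = -5
forms = product_forms(a, b)
values = [det(c) for c in forms]
```

The equality of the four values reflects property 2-a of sec. 10.2: replacing A or B by the array with rows and columns interchanged leaves its determinant unchanged.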


10.5. Preliminary Remarks on Matrices.—If two or more arrays may
be combined in a certain way described in sec. 10.6, they are called matrices.³
We indicate them by

$A = [A_{ij}] = \begin{bmatrix} A_{11} & A_{12} & A_{13} & \cdots & A_{1m} \\ A_{21} & A_{22} & A_{23} & \cdots & A_{2m} \\ A_{31} & A_{32} & A_{33} & \cdots & A_{3m} \\ \vdots & \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & A_{n3} & \cdots & A_{nm} \end{bmatrix}$

Unlike determinants, matrices may be square or rectangular. Matrices
of infinite order⁴ will not be discussed here. When a matrix contains only
one row or column, it is called a vector. For a row vector, we will write

$[x] = [x_1, x_2, x_3, \cdots, x_n]$ (10-5a)
In order to save space, we write a column vector as
$\{x\} = \{x_1, x_2, x_3, \cdots, x_n\}$ (10-5b)
although its matrix form would be
$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix}$

A small letter u, v, ···, z written without brace or bracket always means a
column vector. Matrices with two or more rows or columns will be indicated
by capital letters.

3 For treatises on matrix theory, see references at end of this chapter.
4 For their properties, see Wintner, A., "Spektraltheorie der unendlichen Matrizen,"
Hirzel, Leipzig, 1929.

The elements of a square matrix A may be written and evaluated as a
determinant. If |A| = 0, the corresponding matrix A is called singular.
Since determinants do not exist for rectangular (non-quadratic) arrays, all
rectangular matrices, by definition, are singular. Suppose we formed
determinants of all possible orders by taking successively 1, 2, ···, n rows
and columns of A. If at least one determinant of order r does not vanish
and all determinants of order greater than r do vanish, A is said to be of
rank r. Thus if A is singular and of order n, r < n; if non-singular,
r = n.

Problem. Show that the rank of the following matrix is two:

$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 3 & -1 \\ 0 & 0 & 1 & -3 \\ 3 & 3 & 5 & -3 \end{bmatrix}$

10.6. Combination of Matrices.—Two matrices A and B are equal if
and only if they are identical. If A = B, then $A_{pq} = B_{pq}$ for every p
and q.

The addition or subtraction of two matrices of order n gives a new
matrix of the same order according to the following rule. If A ± B = C,
then $C_{pq} = A_{pq} \pm B_{pq}$. Addition is both commutative and associative,
and subtraction is the addition of −B:

$A \pm B = \pm B + A; \quad (A + B) + C = A + (B + C)$
Multiplication of a matrix by a scalar quantity $\alpha$ is defined by
$\alpha A = \alpha[A_{ij}] = [\alpha A_{ij}] = A\alpha$

Two matrices A and B may be multiplied together in the order AB
only when the number of columns in A equals the number of rows in B.
Under this condition, the matrices are said to be conformable. If A is of
order (n × h), B of order (h × m), the product C is of order (n × m).
Its elements are given by

$C_{pq} = \sum_{s=1}^{h} A_{ps}B_{sq}; \quad (p = 1, 2, \cdots, n;\ q = 1, 2, \cdots, m)$ (10-6)

$AB = [C_{pq}] = C$

This rule for multiplying matrices is not as arbitrary as it might seem; it is
suggested by the properties of linear transformations and the reason for
defining it in this way will be given in sec. 10.10. We note at this point,
however, that the law of matrix multiplication is identical with the first
form of eq. (4) which defines the multiplication of determinants. Hence
det (AB) = (det A) · (det B) if A, B are square, but det (A + B) ≠ det A +
det B. In general, AB ≠ BA, but when the order of multiplication is of no
importance, so that AB = BA, the two matrices are said to commute or to
be permutable. The ordinary laws regarding distribution and association
apply.

$A(B + C)F = ABF + ACF; \quad (AB)C = A(BC) = ABC$
Provided A, B, x and y are properly conformable

$A\{x\} = \{y\}; \quad [x]A = [y]$
$[x]\{y\} = \text{a scalar}; \quad \{x\}[y] = B$ (10-7)

In the last case, B is a square matrix which has the same number of rows
as {x} (or columns as [y]). Its rows (or columns) are proportional to each
other.

A given matrix may be divided into smaller matrices, the result being a
partitioned matrix. For example, a square matrix of order three may be
divided into four submatrices as shown.

$A = \left[\begin{array}{cc|c} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ \hline A_{31} & A_{32} & A_{33} \end{array}\right] = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$

where
$a_{11} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}; \quad a_{12} = \begin{bmatrix} A_{13} \\ A_{23} \end{bmatrix}$
$a_{21} = [A_{31}\ A_{32}]; \quad a_{22} = A_{33}$
If B is a similar matrix and is similarly partitioned, then each submatrix
$a_{ij}$ and $b_{ij}$ may be treated as a single element so that

$AB = C = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}$

Finally, the elements of C are completely evaluated by the usual rules for
matrix multiplication and addition.
If $A = [A_{ij}]$ is a square matrix of order m and $B = [B_{pq}]$ is a square
matrix of order n, then the direct product
$A \times B = [A_{ij}B_{pq}]$

is a square matrix of order mn. The index pairs (i,p) and (j,q) refer to
the row and column, respectively. A suitable convention for arranging
the rows and columns consists in taking these pairs in such a way that
(j,q) precedes (j′,q′) if j < j′, or if j = j′ and q < q′ (dictionary order).
If A, C are of order m and B, F of order n then

$(A \times B)(C \times F) = AC \times BF$
is a matrix of order mn. The direct product of matrices has of course
nothing to do with the cross product of vectors, for which the same symbol,
×, is used.
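The rule (6), the requirement of conformability and the dictionary ordering of the direct product can be sketched together in a few lines. The function names and the integer matrices below are assumptions introduced for illustration; none of them occurs in the text.

```python
def matmul(a, b):
    """Product C = AB of eq. (10-6); A and B must be conformable."""
    if len(a[0]) != len(b):
        raise ValueError("not conformable: columns of A must equal rows of B")
    return [[sum(a[p][s] * b[s][q] for s in range(len(b)))
             for q in range(len(b[0]))]
            for p in range(len(a))]


def direct_product(a, b):
    """Direct product A x B with rows (i, p) and columns (j, q) in dictionary order."""
    m, n = len(a), len(b)
    return [[a[i][j] * b[p][q] for j in range(m) for q in range(n)]
            for i in range(m) for p in range(n)]


def trace(a):
    """Sum of the diagonal elements of a square array."""
    return sum(a[i][i] for i in range(len(a)))


# Illustrative integer matrices: A, C of order 2 and B, F of order 3.
a = [[1, 2], [3, 4]]
c = [[0, 1], [1, 0]]
b = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
f = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]

ac, ca = matmul(a, c), matmul(c, a)          # AC and CA differ in general
lhs = matmul(direct_product(a, b), direct_product(c, f))
rhs = direct_product(matmul(a, c), matmul(b, f))
```

Here `lhs` and `rhs` are the two sides of $(A \times B)(C \times F) = AC \times BF$, both of order 6. An attempt to form `matmul(a, b)` with these matrices raises the error, since a matrix of order (2 × 2) is not conformable with one of order (3 × 3).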


Problem a. Prove eq. (7). 

Problem b. Prove that (A X B)(C X F) = AC x BF. 

10.7. Special Matrices.—When all the elements of a matrix are zero,
the matrix is called null and indicated by O. For any matrix A,

$O + A = A; \quad OA = AO = O$
It should not be inferred, however, that the vanishing of a matrix product
implies that either or both of the matrices multiplied together are the null
matrix (cf. Problem, sec. 10.7).

The unit matrix E has unity for elements along the "main" diagonal.⁵
All other elements are zero. The matrix elements are conveniently
symbolized by the Kronecker delta (cf. sec. 3.4)

$\delta_{pq} = 0,\ p \neq q; \quad \delta_{pq} = 1,\ p = q$
For every matrix,

$EA = AE = A$

If all matrix elements vanish except diagonal ones, the matrix is called
diagonal. The general element of a diagonal matrix is thus of the form
$D_i\delta_{ij}$. All diagonal matrices commute with each other, for if D and D′
are diagonal

$(DD')_{ik} = \sum_j D_i\delta_{ij}D'_j\delta_{jk} = D_iD'_i\delta_{ik} = (D'D)_{ik}$

If a matrix A commutes with a diagonal one, D, the elements $A_{ij}$ will all
vanish, except those for which the diagonal matrix has equal elements,
$D_i = D_j$. The proof is as follows. Assume that AD = DA and, of course,
$D_{ij} = D_i\delta_{ij}$. Then
$\sum_k A_{ik}D_k\delta_{kj} = \sum_k D_i\delta_{ik}A_{kj}$
and
$A_{ij}D_j = D_iA_{ij}; \quad A_{ij}(D_i - D_j) = 0$
Hence, either $D_i = D_j$ or $A_{ij} = 0$.
If all of the diagonal elements of D are different, A must be truly diagonal,
with elements, say, $A_1, A_2, \cdots, A_n$. It is sometimes convenient
to write such a matrix in the form
$A = \text{diag}\,(A_1, A_2, \cdots, A_n)$
If some of the diagonal elements of D are repeated so that its form is
$D = \text{diag}\,(D_1, D_1, D_2, D_2, D_2, \cdots)$
then the commuting matrix A will have the form
$A = \text{diag}\,(a_1, a_2, \cdots)$
where the square matrices $a_i$ are arranged in symmetric positions about the
main diagonal and the other elements of A are zero. The forms of the
submatrices will be
$a_1 = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}; \quad a_2 = \begin{bmatrix} A_{33} & A_{34} & A_{35} \\ A_{43} & A_{44} & A_{45} \\ A_{53} & A_{54} & A_{55} \end{bmatrix}$

5 The main or principal diagonal is that running from the upper left to the lower
right of the array.

The sum of the diagonal elements of a square matrix is called the trace
(German "Spur").

$\text{Tr}\,A = \sum_{i=1}^{n} A_{ii}$

The trace of the product of two matrices is independent of the
order of multiplication; for more than two factors it is unchanged by a
cyclic permutation of the factors. The proof is simple.
$\text{Tr}\,AB = \sum_i (AB)_{ii} = \sum_{i,j} A_{ij}B_{ji} = \text{Tr}\,BA$
If A × B = C, Tr C = Tr A · Tr B.

The transposed matrix to A, indicated by $\tilde{A} = [A_{ji}]$, is formed from A
by interchanging rows and columns. If A and B of (6) are transposed, $\tilde{A}$
becomes of order (h × n) and $\tilde{B}$ of order (m × h). They may be multiplied
together only in the order $\tilde{B}\tilde{A}$ and the product $\tilde{C}$ is of order (m × n).
Thus when a matrix product is transposed, the sequence of the matrices
forming the product must be reversed. This holds true for any number of
factors

$F = ABCD\cdots X; \quad \tilde{F} = \tilde{X}\cdots\tilde{D}\tilde{C}\tilde{B}\tilde{A}$

The matrix $\hat{A} = [A^{ji}]$ is the adjoint matrix.⁶ Note that the adjoint is
formed by first finding the cofactor $A^{pq}$ of the element $A_{pq}$ in |A| and
then transposing the resulting matrix. From the properties of determinants,
it follows that

$A\hat{A} = \hat{A}A = |A|E$ (10-8)
hence if A is singular

$A\hat{A} = \hat{A}A = O$ (10-9)

However, the adjoint matrix exists even when A is singular.

When A is a non-singular square matrix, we may divide $\hat{A}$ by |A| to
obtain a matrix $A^{-1}$ which is the reciprocal of A. Only square matrices
have reciprocals.

$A^{-1} = \hat{A}/|A|; \quad AA^{-1} = A^{-1}A = E$ (10-10)

Suppose the matrices of (6) are square and non-singular. Multiply both
sides of the equation by $B^{-1}A^{-1}$ and then by $C^{-1}$ in the order shown:

$B^{-1}A^{-1}ABC^{-1} = B^{-1}A^{-1}CC^{-1}$

Thus $C^{-1} = B^{-1}A^{-1}$. Reciprocation of a matrix product requires reversal
of the order of the factors as in the case of the transposed matrix product.
The rule holds for any number of factors.
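Equations (8) to (10) and the reversal rule for the reciprocal of a product lend themselves to a direct numerical check. The sketch below builds the adjoint from cofactors exactly as described above and divides by the determinant. The function names and the sample matrices are illustrative assumptions, and the cofactor method is chosen for its closeness to the text rather than for efficiency.

```python
from fractions import Fraction


def det(a):
    """Determinant by expansion along the first row; the empty array counts as 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def cofactor(a, p, q):
    """(-1)^(p+q) times the complementary minor of a[p][q]."""
    sub = [row[:q] + row[q + 1:] for i, row in enumerate(a) if i != p]
    return (-1) ** (p + q) * det(sub)


def adjoint(a):
    """Transposed array of cofactors: element (i, j) is the cofactor of a[j][i]."""
    n = range(len(a))
    return [[cofactor(a, j, i) for j in n] for i in n]


def reciprocal(a):
    """Adjoint divided by the determinant, eq. (10-10)."""
    d = det(a)
    if d == 0:
        raise ZeroDivisionError("a singular matrix has no reciprocal")
    return [[Fraction(x) / d for x in row] for row in adjoint(a)]


def matmul(a, b):
    return [[sum(a[i][s] * b[s][j] for s in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]      # |A| = 18
b = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]      # |B| = 3
singular = [[1, 2], [2, 4]]                # |A| = 0, but the adjoint still exists

a_times_adjoint = matmul(a, adjoint(a))                 # eq. (10-8)
singular_times_adjoint = matmul(singular, adjoint(singular))   # eq. (10-9)
reversed_rule = reciprocal(matmul(a, b)) == matmul(reciprocal(b), reciprocal(a))
```

The use of `Fraction` keeps the elements of the reciprocal exact, so the comparison in the last line is an equality and not an approximate agreement.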

If the elements of A are complex numbers, the complex conjugate of A is
defined as $A^* = [A^*_{ij}]$. Unlike the preceding case, if $F = ABC\cdots X$,
$F^* = A^*B^*C^*\cdots X^*$.

6 This name seems to be in agreement with the usual mathematical convention.
Writers on quantum mechanics frequently call that matrix adjoint which we later call
associate. The reader should take care not to be confused by this situation.

The matrix formed by taking the complex conjugate of all the elements
and then transposing the matrix is called the associate matrix,⁷
$A^\dagger = (\tilde{A})^* = \widetilde{(A^*)}$. If $F = ABC$, $F^\dagger = C^\dagger B^\dagger A^\dagger$.

At this point we have defined four important operations on a matrix A.
These result in $-A$, $\tilde{A}$, $A^{-1}$ and $A^*$. It is important to note that each of
these operations has the reflexive property, so that when the operation is
performed twice, the original matrix is reproduced:

$-(-A) = A; \quad \tilde{\tilde{A}} = A; \quad (A^{-1})^{-1} = A; \quad (A^*)^* = A$

By combining these operations in all possible ways, the following 16 matrices
may be derived from A: $\pm A$, $\pm\tilde{A}$, $\pm A^{-1}$, $\pm A^*$, $\pm(\tilde{A})^{-1}$, $\pm(A^*)^{-1}$,
$\pm A^\dagger$, $\pm(A^\dagger)^{-1}$. In certain cases, A may be identical with some other
member of this set. Such matrices have been given special names. We
shall have occasion to discuss the properties of most of them later, but
for convenience we list them now in Table 1. We will have no need of
the types: $A = A^{-1}$ (involutory) and $A = (A^*)^{-1}$.


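TABLE 1

Relation                      Name of A          Matrix Elements
$A = \tilde{A}$               symmetric          $A_{pq} = A_{qp}$
$A = -\tilde{A}$              skew symmetric     $A_{pp} = 0$; $A_{pq} = -A_{qp}$
$A = \tilde{A}^{-1}$          orthogonal         cf. eq. (42)
$A = A^*$                     real               $A_{pq} = A^*_{pq}$
$A = -A^*$                    pure imaginary     $A_{pq} = iB_{pq}$; $B_{pq}$ real
$A = A^\dagger$               Hermitian          $A_{pq} = A^*_{qp}$
$A = -A^\dagger$              skew Hermitian     $\text{Re}\,A_{pp} = 0$; $A_{pq} = -A^*_{qp}$
$A = (A^\dagger)^{-1}$        unitary            cf. eq. (50)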


Note that a real symmetric matrix is a special case of an Hermitian
matrix. Suppose H = A + iB is Hermitian with both A and B real; then
$H^\dagger = \tilde{A} - i\tilde{B}$; but by definition $H = H^\dagger$. Thus the real part is symmetric
and the imaginary part skew symmetric; in other words, a real
Hermitian matrix is also symmetric. Similarly, a real orthogonal matrix is
unitary, for if U = A + iB is unitary then by definition $U = (U^\dagger)^{-1}$,
$U^\dagger U = E$ and $(\tilde{A} - i\tilde{B})(A + iB) = E$. If $B = \tilde{B} = 0$, then $\tilde{A}A = E$
which defines the orthogonal matrix. However, a complex symmetric
matrix is not Hermitian nor is a complex orthogonal matrix unitary.

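Problem. Show that AB = O but BA ≠ O where

$A = \begin{bmatrix} -6 & -4 & -2 \\ -9 & -6 & -3 \\ 3 & 2 & 1 \end{bmatrix}; \quad B = \begin{bmatrix} 0 & 1 & -2 \\ -1 & 0 & 3 \\ 2 & -3 & 0 \end{bmatrix}$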

T This is the matrix called adjoint by writers on quantum mechanics. It is also 
called the Hermitian conjugate. 
10.8. Real Linear Vector Space. Let us consider a space of two dimensions, that is, a plane in ordinary three-dimensional space. A vector in this space, as we have shown in sec. 4.1, is completely described by its two components or by the coordinates of its origin and terminus. It is also described by the matrix $x$ of one column, or its transposed, the row vector $\tilde{x}$, the two real numbers which are its components being the two matrix elements. After we have chosen one vector it is possible to find another vector $y$ in the same plane which is not a multiple of $x$. In fact, $y$ is completely independent of $x$. But no matter how we draw a third vector $z$, it may always be represented as

$ax + by = z$

where $a$ and $b$ are numbers. There is nothing unique about $x$ and $y$, the point being that two and only two vectors are linearly independent in two dimensions and a third vector is linearly dependent on the other two. The situation may further be characterized as follows. If two vectors are linearly independent, no relation

$ax + by = 0$

can exist unless $a = b = 0$, for as we have seen a linear combination of two vectors gives a new vector. For the purposes of this chapter, we shall need more than two or three dimensions, hence we shall speak of a space of n dimensions, where n is an integer. When n is greater than three, it is, of course, impossible to visualize the situation, but the geometric concepts of ordinary space will be used wherever convenient. Thus an n-dimensional coordinate system will consist of n mutually perpendicular axes, a point will require n coordinates for its location and a vector will be described by means of its components or by the coordinates of its origin and terminus.

Suppose the components of a vector in such a space are real numbers $x_1, x_2, \cdots, x_n$; then we may write the vector $x$ as a matrix of either a single row or a single column as in (5a) or (5b).

The scalar product of two vectors⁸ is a scalar

$\tilde{x}y = x_1y_1 + x_2y_2 + \cdots + x_ny_n$    (10-11)

The square of the length of a vector is defined as in sec. 4.1:

$\tilde{x}x = x_1^2 + x_2^2 + \cdots + x_n^2$    (10-12)

The vector product, usually denoted by $y \times x$, is more difficult to formulate by matrix methods. To obtain it, we first construct from $y$ the skew-symmetric matrix

$Y = \begin{bmatrix} 0 & -y_3 & y_2 \\ y_3 & 0 & -y_1 \\ -y_2 & y_1 & 0 \end{bmatrix}$

In terms of this,

$y \times x = Yx$

8 For definiteness, we suppose that $x$ and $y$ are both column vectors. Note that $\tilde{x}y$ is the equivalent, in matrix notation, of $x \cdot y$ in vector notation.
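The vectors $u_1, u_2, \cdots, u_n$ are linearly independent if there exists no set of scalar quantities $c_1, c_2, \cdots, c_n$, not all zero, such that

$c_1u_1 + c_2u_2 + \cdots + c_nu_n = 0$    (10-13)

The simplest way of testing vectors for linear independence is to evaluate the Gram determinant (see sec. 3.13)

$|\Gamma| = \begin{vmatrix} \tilde{u}_1u_1 & \tilde{u}_1u_2 & \cdots & \tilde{u}_1u_n \\ \tilde{u}_2u_1 & \tilde{u}_2u_2 & \cdots & \tilde{u}_2u_n \\ \cdots & \cdots & \cdots & \cdots \\ \tilde{u}_nu_1 & \tilde{u}_nu_2 & \cdots & \tilde{u}_nu_n \end{vmatrix}$

If $|\Gamma|$ vanishes, the vectors are linearly dependent; if $|\Gamma| > 0$, linearly independent.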
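The Gram test can be tried numerically. The following is a minimal sketch, not part of the original text: the function names (`dot`, `det`, `gram_determinant`) and the two sample sets of vectors are chosen here purely for illustration, and exact rational arithmetic is assumed so that a vanishing determinant is recognized without rounding error.

```python
from fractions import Fraction


def dot(u, v):
    """Scalar product of two real vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def gram_determinant(vectors):
    """Determinant of the matrix whose (i, j) element is the scalar product of u_i and u_j."""
    vs = [[Fraction(c) for c in v] for v in vectors]
    gram = [[dot(u, v) for v in vs] for u in vs]
    return det(gram)


# Illustrative data: the first set is independent, the second is not
# (its second vector is twice the first).
independent = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
dependent = [(1, 2, 3), (2, 4, 6), (0, 1, 0)]

print(gram_determinant(independent))
print(gram_determinant(dependent))
```

For floating-point components the comparison with zero would have to be replaced by a tolerance, which is a design choice left open here.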

When n linearly independent vectors have been chosen they form an n-dimensional coordinate system or basis, being equivalent to a set of n coordinate axes. Any other vector $v$ may then be expressed as a linear combination of the chosen vectors $u_1, u_2, \cdots, u_n$, the linear combination being unique. It should be emphasized that there is nothing unique about the choice of the basis, for any n linearly independent vectors are suitable for that purpose although the most convenient choice, in general, is a set of unit vectors. The latter are defined by the relations⁹

$e_1 = \{1,0,0,0,\cdots,0\}$
$e_2 = \{0,1,0,0,\cdots,0\}$
$e_3 = \{0,0,1,0,\cdots,0\}$

or similarly as row vectors. Clearly they are of unit length and mutually perpendicular, for

$\tilde{e}_ie_j = \delta_{ij}$    (10-14)

In terms of the unit vectors, any vector $x$ may be written

$x = x_1e_1 + x_2e_2 + \cdots + x_ne_n$    (10-15)

If the origin of $x$ is taken as coincident with the origin of the basis formed by the $e_i$, the components of $x$ are the coordinates of the terminus of $x$.

9 Notice that the subscripts on the vectors $e_i$ do not designate components.

It is often necessary to use a particular set of linearly independent vectors as a basis, constructing from them a set whose members are mutually perpendicular and of unit length. This procedure, known as Schmidt's orthogonalization method, is effected in the following way. Suppose the n given vectors are $u_1, u_2, \cdots, u_n$. Select any one of them, say $u_1$, and let $v_1 = u_1$, $e_1 = v_1/l_1$, where $l_1$ is the length of $v_1$. Now take $v_2 = u_2 - c_{21}e_1$, choosing $c_{21}$ so that $\tilde{e}_1v_2 = \tilde{e}_1u_2 - c_{21}\tilde{e}_1e_1 = 0$, which requires $c_{21} = \tilde{e}_1u_2$ or $v_2 = u_2 - (\tilde{e}_1u_2)e_1$. If we put $e_2 = v_2/l_2$, where $l_2$ is the length of $v_2$, we shall have $\tilde{e}_ie_j = \delta_{ij}$ for $i, j = 1, 2$. Next let $v_3 = u_3 - c_{31}e_1 - c_{32}e_2$, determining the constants so that $\tilde{e}_1v_3 = \tilde{e}_1u_3 - c_{31} = 0$ and $\tilde{e}_2v_3 = \tilde{e}_2u_3 - c_{32} = 0$, which means that $c_{31} = \tilde{e}_1u_3$ and $c_{32} = \tilde{e}_2u_3$. Finally let $e_3 = v_3/l_3$. Continuing in this way, we may construct the complete set of n unit vectors with

$e_{m+1} = v_{m+1}/l_{m+1}$;  $v_{m+1} = u_{m+1} - \sum_{k=1}^{m} (\tilde{e}_ku_{m+1})e_k$
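The recursion for $v_{m+1}$ and $e_{m+1}$ translates almost directly into a short routine. What follows is a sketch under stated assumptions rather than a definitive implementation: the names `schmidt` and `tol`, the tolerance used to detect dependent vectors, and the sample vectors are all introduced here for illustration.

```python
import math


def dot(u, v):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def schmidt(vectors, tol=1e-12):
    """Build orthonormal vectors e_1, ..., e_n from linearly independent u_1, ..., u_n."""
    basis = []
    for u in vectors:
        v = list(u)
        for e in basis:
            c = dot(e, u)  # the constant c_mk, i.e. the scalar product of e_k and u_m
            v = [vi - c * ei for vi, ei in zip(v, e)]
        length = math.sqrt(dot(v, v))
        if length < tol:
            # v vanishes when u depends linearly on the earlier vectors
            raise ValueError("vectors are linearly dependent")
        basis.append([vi / length for vi in v])
    return basis


# Illustrative input: three independent vectors in three dimensions.
e = schmidt([(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)])
for row in e:
    print(row)
```

The resulting set depends on the order in which the given vectors are taken, in agreement with the remark that any one of them may be selected first.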
Problem a. Consider the columns of the matrix of Problem, sec. 10.5, as the components of four vectors. Test them for linear dependence.
Problem b. Prove that $Y^3 = -(\tilde{y}y)Y$. This relation is known as the "Cayley identity."


10.9. Linear Equations.—Matrix methods are useful in solving and 
discussing linear equations of the form 


$A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n = y_1$
$A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n = y_2$
$\cdots$
$A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n = y_n$    (10-16)

which are inhomogeneous.¹⁰ They may also be written as

$Ax = y$

The corresponding homogeneous equation is

$Ax = 0$

The matrix $A$ and the vector $y$ are to be considered as known while the n components of $x$ are unknown. The questions of chief interest concern the number of possible solutions and the method of finding them. Several cases arise depending on the rank of $A$, but for our purposes we consider¹¹ only three possibilities.

a. $y \neq 0$; $|A| \neq 0$. According to (10), $A^{-1}$ exists, hence

$x = A^{-1}y = \dfrac{\hat{A}y}{|A|}$

is the unique solution. From the definition of the adjoint matrix $\hat{A}$ and the rule for matrix multiplication, it also follows that

$x_i = \dfrac{1}{|A|}(y_1A^{1i} + y_2A^{2i} + \cdots + y_nA^{ni})$

where $A^{ki}$ denotes the cofactor of the element $A_{ki}$. This result is commonly known as Cramer's rule.¹²

10 See sec. 2.5 for the meaning of the term homogeneous.

11 The others are discussed by Bôcher, M., "Introduction to Higher Algebra," Macmillan Co., New York, 1907.

12 In actual calculations, it is usually simpler to solve (16) by direct elimination (cf. sec. 13.26).
b. $y = 0$; $|A| \neq 0$. The only solutions are the trivial ones $x_1 = x_2 = \cdots = x_n = 0$.

c. $y = 0$; $|A| = 0$; $A^{ik} \neq 0$ for at least one value of i and k. If we knew the value of one of the unknowns we could find the values of the remaining (n − 1) unknowns, since we could then form from the original set (n − 1) inhomogeneous equations with non-vanishing determinant. In other words, we are confronted in case (c) with n unknowns but only (n − 1) equations. However, we note that the k-th row of our set of n equations is

$A_{k1}x_1 + A_{k2}x_2 + \cdots + A_{kn}x_n = 0$    (10-17)

so that if we take

$x_i = cA^{ji}$    (10-18)

where c is any constant, it follows from (8) that (17) is satisfied. Even if j = k, we still have a solution, for (17) is then identical with (2) but $|A| = 0$. We thus have an infinite number of solutions of the homogeneous equation when $|A| = 0$, as j may take any value from 1 to n and c is completely arbitrary. Of course, some of the solutions (18) may be worthless, since several of the cofactors may vanish; but it will be found that there are always enough non-vanishing ones so that the ratio of all the unknowns is determined.¹³ The fact that the set of homogeneous equations $Ax = 0$ possesses non-trivial solutions only when $|A| = 0$ is of great importance in many problems and will often be used in the next chapter.
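Cases (a) and (c) can both be illustrated with cofactors. The sketch below is illustrative only: the helper names (`det`, `cofactor`, `cramer`, `homogeneous_solution`) and the two sample matrices are assumptions made here, indices are counted from zero as is usual in Python, and exact rational arithmetic is used so that the test $|A| = 0$ is unambiguous.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if not m:
        return 1  # determinant of the empty (0 x 0) matrix
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def cofactor(a, i, k):
    """Cofactor of the element in row i, column k (0-based indices)."""
    minor = [row[:k] + row[k + 1:] for r, row in enumerate(a) if r != i]
    return (-1) ** (i + k) * det(minor)


def cramer(a, y):
    """Case a: x_i = (1/|A|) * sum over k of y_k times the cofactor of A_ki."""
    d = det(a)
    if d == 0:
        raise ValueError("|A| = 0: Cramer's rule does not apply")
    n = len(a)
    return [sum(Fraction(y[k]) * cofactor(a, k, i) for k in range(n)) / d
            for i in range(n)]


def homogeneous_solution(a, j=0, c=1):
    """Case c: x_i = c times the cofactor of A_ji solves Ax = 0 when |A| = 0."""
    if det(a) != 0:
        raise ValueError("|A| != 0: only the trivial solution exists")
    return [c * cofactor(a, j, i) for i in range(len(a))]


# Illustrative inhomogeneous system with non-vanishing determinant.
A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
y = [8, -11, -3]
x = cramer(A, y)

# Illustrative singular matrix for the homogeneous case.
S = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x0 = homogeneous_solution(S, j=0)
residual = [sum(S[r][i] * x0[i] for i in range(3)) for r in range(3)]
print(x, x0, residual)
```

As footnote 12 of the text points out, direct elimination is usually simpler in actual calculations; the cofactor expansion used here grows very rapidly with n and is meant only to mirror the formulas.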


Problem a. Sometimes chemical analysis must be done in an indirect way. Solve the following problem by means of determinants. A mixture of sodium chloride, sodium bromide, and sodium iodide weighed 0.5000 gram. Upon the addition of silver nitrate, the mixed silver halides weighed 1.0369 g. The iodine in the mixture was precipitated as palladous iodide, which weighed 0.3006 g. Find the composition of the original mixture.

Ans. NaCl, 50%; NaBr, 25%; NaI, 25%.

Problem b. Solve the set of linear equations which result from application of Kirchhoff's laws to the network known as a Wheatstone bridge, out of balance, and obtain the current through the galvanometer. Label resistances $R_1, R_2, R_3, R_4$, going clockwise around the "diamond." Let galvanometer resistance be $R_5$.

Ans. $i_5 = \dfrac{(R_2R_4 - R_1R_3)E}{R_1R_2R_3 + R_2R_3R_4 + R_3R_4R_1 + R_4R_1R_2 + R_5(R_1 + R_2)(R_3 + R_4)}$

where $E$ is the external electromotive force.


10.10. Linear Transformations.—Consider a set of linear, inhomo- 
geneous equations, similar to (16), relating m quantities x and h quantities 
x’, which can be written in the form 

$x'_s = \sum_q B_{sq}x_q$;  $s = 1, 2, \cdots, h$;  $q = 1, 2, \cdots, m$    (10-19)

These equations define a linear transformation, which may be interpreted in two different ways, as we shall see later.

13 The proof is given by Bôcher, p. 4.

Furthermore, let n quantities $x''$ be related to $x'$:

$x''_p = \sum_s A_{ps}x'_s$;  $p = 1, 2, \cdots, n$

Then, on combining these equations, we find

$x''_p = \sum_{s,q} A_{ps}B_{sq}x_q$

The same equations could also be written as a single sum

$x''_p = \sum_q C_{pq}x_q$    (10-20)

provided that we agree to take $C_{pq}$ as in eq. (6) which defines the law of matrix multiplication.

The importance of linear transformations thus indicates that this definition of matrix multiplication would be a useful one. It could, of course, be defined in other ways, and one possibility is

$(AB)_{ik} = A_{ij}B_{kj}$

where, for convenience, the summation sign is omitted and summation over the repeated index, as in tensor analysis, is required. The consequence of this definition is interesting, for if we multiply by a third matrix $C$, we find

$(AB \cdot C)_{ik} = (AB)_{ij}C_{kj} = A_{is}B_{js}C_{kj}$

but

$(A \cdot BC)_{ik} = A_{is}(BC)_{ks} = A_{is}B_{kj}C_{sj}$

and the matrix elements are not the same in the two cases. Therefore, the associative law of multiplication would not hold.
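The failure of the associative law under the alternative definition is easily seen with small matrices. In this sketch the function names `alt_product` and `product`, as well as the three 2 × 2 matrices, are arbitrary choices made here for illustration.

```python
def alt_product(a, b):
    """Alternative rule (AB)_ik = sum over j of A_ij B_kj (A times the transpose of B)."""
    return [[sum(a[i][j] * b[k][j] for j in range(len(b[0])))
             for k in range(len(b))]
            for i in range(len(a))]


def product(a, b):
    """Ordinary rule (AB)_ik = sum over j of A_ij B_jk, as in eq. (6)."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))]
            for i in range(len(a))]


# Illustrative matrices; B and C are deliberately not symmetric.
A = [[1, 2], [3, 4]]
B = [[0, 1], [2, 1]]
C = [[2, 0], [1, 3]]

left_alt = alt_product(alt_product(A, B), C)    # (AB . C) under the alternative rule
right_alt = alt_product(A, alt_product(B, C))   # (A . BC) under the alternative rule
left = product(product(A, B), C)
right = product(A, product(B, C))

print(left_alt, right_alt)   # differ
print(left, right)           # agree
```

For symmetric matrices the two rules coincide, so a test with symmetric $B$ and $C$ would not reveal the difference.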

A linear transformation is frequently interpreted as a rotation of coordinate axes. In sec. 4.1, we have shown how direction cosines may be used to relate the components of a vector in two different coordinate systems. Let us now consider these relations in matrix notation. Suppose that two systems $OXYZ$ and $OX'Y'Z'$, or bases with unit vectors $e$ and $e'$, coincide originally and that $OX'Y'Z'$ is then rotated in the positive (counterclockwise) direction through the angle $\varphi$ about the OZ-axis. A vector $x'$ in the new system is related to the same vector $x$ in the original system by an equation similar to (19), which in matrix form would read

$x' = B(\varphi)x$    (10-19a)

Rotate the system again, this time through the angle $\theta$, to obtain $OX''Y''Z''$ and the vector becomes $x''$, where $x'' = A(\theta)x'$. We could then write

$x'' = Ax' = ABx = Cx$    (10-20a)
from which we see that $C = AB$ is a matrix which transforms $x$ directly to $x''$, without passing through the intermediate system in which it was called $x'$. We also see from (20a) that, if physical significance is to be attached to the matrices $A$ and $B$, the order of performing the operations involved must be preserved, for in general $AB \neq BA$, although the matrices do commute in the case now being discussed. As suggested by this example, we always understand that the order¹⁴ is from right to left, $B$ first and then $A$ in the case of eq. (20a).

Provided $C$ is not singular, we may find $C^{-1}$, hence

$C^{-1}x'' = C^{-1}Cx = x$    (10-21)

We may thus use (20) or (21) to determine the components of the same vector in either of two different coordinate systems.
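Equations (10-20a) and (10-21) can be checked numerically for two successive rotations about the OZ-axis. The sketch below assumes the explicit rotation matrix that the text gives later, in sec. 10.17; the function names, the two angles and the sample vector are illustrative choices, and the inverse of $C$ is taken to be the rotation through the opposite angle.

```python
import math


def rotation_z(angle):
    """Matrix relating components in axes turned counterclockwise by `angle` about OZ."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matvec(a, x):
    """Product of a matrix with a column vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two matrices within a tolerance."""
    return all(abs(p - q) < tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))


phi, theta = 0.3, 1.1          # illustrative angles in radians
B = rotation_z(phi)            # first rotation:  x'  = B x
A = rotation_z(theta)          # second rotation: x'' = A x'
C = matmul(A, B)               # combined:        x'' = C x

x = [1.0, 2.0, 3.0]
x2 = matvec(A, matvec(B, x))   # B applied first, then A
x2_direct = matvec(C, x)

C_inv = rotation_z(-(phi + theta))   # inverse taken as the opposite rotation
x_back = matvec(C_inv, x2)

print(close(C, rotation_z(phi + theta)), close(C, matmul(B, A)))
print(x2, x2_direct, x_back)
```

Rotations about one and the same axis commute, which is the special circumstance mentioned in the text; rotations about different axes in general do not.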

Sometimes one wishes to think of linear transformations in another way. Suppose that there is only one coordinate system and that a vector $x$ is rotated through the angle $\varphi$ in the positive direction to give a new vector $y$ in the same coordinate system. If the reader will think about the situation for a moment (or draw a simple figure, if necessary) he will be convinced that the operation is equivalent to rotation of the coordinate system through the angle $\varphi$ in the negative (clockwise) direction. The matrix elements will therefore not be identical with those used when we assume two different coordinate systems. They are, however, closely related as will be shown in sec. 10.17.

10.11. Equivalent Matrices.—Let P and Q be non-singular matrices. 
Then A and B are said to be equivalent when 


B = PAQ (10-22) 


Equivalent matrices have many properties in common as the subsequent 
discussion will show; their importance is due to the fact that it is often 
possible by means of a linear transformation like (22) to find an equivalent 
matrix which has simpler properties than the original one. When the 
equivalent matrix is in its simplest form, usually diagonal, it is said to be 
canonical. The problem of finding an equivalent matrix of canonical 
form is analogous to that of finding a suitable coordinate system in ordinary 
scalar algebra (cf. Chapter 5). 

Several special cases of equivalent matrices are possible, depending on 
the nature of the matrices P and Q effecting the transformation. 


14 The opposite convention is often used and endless confusion may result if this fact is overlooked when the equations of various writers are compared. The elements of the matrix products, of course, must always be evaluated from left to right, as required by eq. (6).

15 Canonical matrices are exhaustively discussed by Turnbull, H. W., and Aitken, A. C., "The Theory of Canonical Matrices," Blackie and Son, London, 1932.
a. If $PQ = E$, then

$B = Q^{-1}AQ$    (10-23)

The transformation is called collineatory or a similarity transformation (Ähnlichkeitstransformation). The two matrices $A$ and $B$ are said to be the transforms of each other.

b. If $P = \tilde{Q}$, the transformation is called congruent:

$B = \tilde{Q}AQ$    (10-24)

c. If $P = Q^\dagger$, then

$B = Q^\dagger AQ$    (10-25)

the transformation is conjunctive. If the matrices are all real, this becomes identical with (24).

d. If $PQ = E$, $P = \tilde{Q}$ (i.e., $Q$ is orthogonal) and all the matrix elements are real, then

$B = \tilde{Q}AQ = Q^{-1}AQ$    (10-26)

represents a real orthogonal transformation. It is both collineatory and congruent.

e. If the matrix elements are complex and $PQ = E$, $P = Q^\dagger = Q^{-1}$ (i.e., $Q$ is unitary), then

$B = Q^\dagger AQ = Q^{-1}AQ$    (10-27)

is called a unitary transformation. It is collineatory and conjunctive.
10.12. Bilinear and Quadratic Forms.—A homogeneous polynomial of the second degree in 2n variables $x_1, x_2, \cdots, x_n$; $y_1, y_2, \cdots, y_n$ is called a bilinear form. It may be abbreviated as

$A(x,y) = \tilde{x}Ay = \sum_{i,j} A_{ij}x_iy_j$    (10-28a)

where $A = [A_{ij}]$. If both $x$ and $y$ undergo non-singular transformations

$x = Px'$;  $y = Qy'$

then

$A(x,y) = \tilde{x}'\tilde{P}AQy' = \tilde{x}'By' = A'(x',y')$    (10-29)

If $\tilde{P} = Q^{-1}$, so that $x = \tilde{Q}^{-1}x'$; $y = Qy'$, then $x$ and $y$ are called contragredient variables since they undergo opposite transformations.

As a special case of a bilinear form suppose $x = y$. Then the coefficient of $x_ix_j$ ($i \neq j$) in (29) is $(A_{ij} + A_{ji})$, and the matrix $A$ becomes symmetric if we write $(A_{ij} + A_{ji})/2$ for every $A_{ij}$ and $A_{ji}$. Eq. (28a) may then be written

$A(x,x) = \sum_{i,j} A_{ij}x_ix_j = \tilde{x}Ax$;  $A = \tilde{A}$    (10-28b)

Such a function is a quadratic form; if it is positive for all real values of the variables (not all zero), it is called a positive definite quadratic form; if it is positive or zero, it is called semi-definite.

10.13. Similarity Transformations.—Suppose the same vector is called x 
when referred to the basis e and x’ when referred to another basis e’. 
Let another vector be y or y’, where 


$x = Qx'$;  $y = Qy'$    (10-30)

Such variables which undergo the same transformation are said to be cogredient. Now consider the transformation

$x = Ay$    (10-31)

which changes $y$ into $x$ in the basis $e$. Then,

$Qx' = Ay = AQy'$

or, if $Q$ is non-singular

$x' = Q^{-1}AQy' = By'$    (10-32)

Hence, (32) is a transformation which changes $y'$ into $x'$ in the basis $e'$ while $A$, the transform of $B$, performs the corresponding transformation from $y$ to $x$ in the basis $e$. This is the reason for the name similarity transformation.

An alternative interpretation of the transform may be given. Let $x$, $x'$, $y$, $y'$ be four different vectors all in the same basis. Then (30) changes $x'$ into $x$ and $y'$ into $y$ while (31) changes $y$ into $x$. The single transformation that changes $y'$ directly into $x'$ is (32), since $x' = Q^{-1}x = Q^{-1}Ay = Q^{-1}AQy'$. Here, as in sec. 10.10, the form of the matrix equations is similar for different vectors in the same basis or the same vectors in different reference frames. The matrix elements, however, will not be identical in the two cases.

10.14. The Characteristic Equation of a Matrix.—If $\lambda$ is a scalar parameter, $A$ is a square matrix of order n and $E$ the unit matrix of the same order, the matrix

$K = [\lambda E - A]$    (10-33)

is called the characteristic matrix of $A$. The equation

$K(\lambda) = |K| = |\lambda E - A| = 0$

or its equivalent

$K(\lambda) = \lambda^n + a_1\lambda^{n-1} + a_2\lambda^{n-2} + \cdots + a_n = 0$    (10-34)

where the $a_i$ are functions of the elements of $A$, is the characteristic equation of $A$. The n roots of $K(\lambda)$, $\lambda_1, \lambda_2, \lambda_3, \cdots, \lambda_n$, not necessarily all different, are the characteristic (or latent) roots. On writing (34) in the form

$(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3) \cdots (\lambda - \lambda_n) = 0$

and comparing coefficients of the different powers of $\lambda$ it will be seen that

$\lambda_1 + \lambda_2 + \cdots + \lambda_n = -a_1$
$\lambda_1\lambda_2 + \lambda_1\lambda_3 + \cdots + \lambda_{n-1}\lambda_n = a_2$
$\lambda_1\lambda_2\lambda_3 + \cdots + \lambda_{n-2}\lambda_{n-1}\lambda_n = -a_3$
$\cdots$
$\lambda_1\lambda_2\lambda_3 \cdots \lambda_n = (-1)^na_n$

If $B = Q^{-1}AQ$, then $[\lambda E - B] = [\lambda E - Q^{-1}AQ] = Q^{-1}[\lambda E - A]Q$. Moreover,

$|\lambda E - B| = |Q^{-1}|\,|\lambda E - A|\,|Q| = |\lambda E - A|$    (10-35)

Hence two matrices related by a similarity transformation have the same characteristic roots.
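A numerical check (not a proof) of eq. (10-35) is sketched below. The coefficients of the characteristic equation are computed here by the Faddeev-LeVerrier recursion, a technique that is not used in the text and is brought in only because it needs nothing beyond matrix products and traces; the matrices `A`, `Q` and `Q_inv` are illustrative, with `Q_inv` written out by hand and verified in the code.

```python
from fractions import Fraction


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def char_poly(a):
    """Coefficients [1, a_1, ..., a_n] of |lambda E - A| (Faddeev-LeVerrier recursion)."""
    n = len(a)
    a = [[Fraction(v) for v in row] for row in a]
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        am = matmul(a, m)
        c = -sum(am[i][i] for i in range(n)) / k   # a_k = -Tr(A M_k) / k
        coeffs.append(c)
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs


# Illustrative matrices.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
Q = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
Q_inv = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]
E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert matmul(Q, Q_inv) == E

B = matmul(Q_inv, matmul(A, Q))   # the transform of A

print(char_poly(A))
print(char_poly(B))
```

Since $a_1$ is the negative of the trace and $a_n$ is $(-1)^n$ times the determinant, equality of the two coefficient lists also illustrates the invariance of trace and determinant under a similarity transformation.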
We leave the proof of the following statements to the reader.

$\mathrm{Tr}\, Q^{-1}AQ = \mathrm{Tr}\, A$

$|Q^{-1}AQ| = |A|$

If $C = A \times B$, the characteristic roots of $C$ are the products, taken in pairs, of the roots of $A$ and $B$.

Problem. Prove the statements of the preceding paragraph.


10.15. Reduction of a Matrix to Diagonal Form.—Consider the linear 
transformation 
$Ax = \lambda x$    (10-36)

The only effect of the matrix $A$ on the vector $x$ is to multiply it by the constant scalar factor $\lambda$. Rewriting (36) in the form

$[\lambda E - A]x = Kx = 0$    (10-37)

we see that, except for the trivial case where all the components of $x$ are zero, $|K|$ must vanish (cf. sec. 10.9b, c). Hence, as shown in the previous section, $\lambda$ can only take the values $\lambda_1, \lambda_2, \cdots, \lambda_n$, where $\lambda_i$ is one of the characteristic roots of $K(\lambda)$. These quantities are the eigenvalues of the matrix $A$; the accompanying sets of vectors $x$ are the eigenvectors. Eq. (36) is the matrix form of an eigenvalue equation, other examples of which were discussed in Chapter 8.

Now suppose $B$ is a diagonal matrix; then the roots of its characteristic equation are identical with its diagonal elements. If $A$ is not a diagonal matrix but is related to $B$ by a similarity transformation, $B = Q^{-1}AQ$, then it follows from (35) that its characteristic roots and equation are the same as those of $B$. The problem of reducing $A$ to diagonal form by means of a similarity transformation is thus closely related to the problem of finding its eigenvalues. We shall now show how such a reduction may be made.

The eigenvalues themselves must first be obtained¹⁶ by solving (34).

16 Numerical methods of finding them are discussed in sec. 13.28.
Having found the $\lambda_i$ we wish to determine a matrix $X$ such that

$X^{-1}AX = \Lambda = [\lambda_i\delta_{ij}]$    (10-38)

We distinguish the following two cases.

a. The Eigenvalues are all Different. Let us consider the case in which all eigenvalues of $A$ are different. Select one, say $\lambda_k$, and form the n linear equations

$Ax = \lambda_kx$    (10-39)

They are homogeneous, but as shown in sec. 10.9, we may solve them for the ratio of the components of the eigenvector $x_k$. Remembering that each component contains an arbitrary constant, we write them as a column vector

$x_k = \{x_{1k}, x_{2k}, \cdots, x_{nk}\}$

The remaining eigenvectors are determined in the same way using each eigenvalue in turn. Finally we form a matrix $X$ whose columns are the eigenvectors of $A$. This matrix clearly satisfies the equation

$AX = X[\lambda_i\delta_{ij}]$

When this result is multiplied from the left by $X^{-1}$ (which exists because eigenvectors belonging to different eigenvalues are linearly independent), eq. (38) is obtained. We have thus shown that the matrix $X$ which diagonalizes $A$ may be found by compounding the eigenvectors of $A$ into a matrix. The reduction to diagonal form here described is unique except for the order in which the eigenvalues occur along the diagonal.

Although not required by the method, the eigenvectors may be orthogonalized and normalized by the Schmidt process, which fixes the undetermined constant appearing in the solution of (39). We return to this question in sec. 10.17.
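The procedure of case (a) may be followed through for a matrix of order two. The sketch below is illustrative only: the matrix `A`, its eigenvalues (the roots of $\lambda^2 - 7\lambda + 10 = 0$) and all function names are chosen here, and each eigenvector is taken from the cofactors of the first row of $\lambda E - A$ with the arbitrary constant set equal to one, in the manner of eq. (10-18).

```python
from fractions import Fraction


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse_2x2(m):
    """Inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def eigenvector(a, lam):
    """Cofactors of the first row of K = lam*E - A (arbitrary constant c = 1).

    If both cofactors vanished, the second row of K would have to be used instead.
    """
    k = [[lam - a[0][0], -a[0][1]], [-a[1][0], lam - a[1][1]]]
    return [k[1][1], -k[1][0]]


# Illustrative matrix with two different eigenvalues.
A = [[Fraction(4), Fraction(1)], [Fraction(2), Fraction(3)]]
eigenvalues = [Fraction(5), Fraction(2)]

vectors = [eigenvector(A, lam) for lam in eigenvalues]
# X has the eigenvectors as its columns.
X = [[vectors[0][0], vectors[1][0]], [vectors[0][1], vectors[1][1]]]
D = matmul(inverse_2x2(X), matmul(A, X))
print(D)
```

Interchanging the two columns of `X` interchanges the diagonal elements of `D`, which is the only freedom left in the reduced form.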

b. The Eigenvalues are Not all Different. When two or more of the eigenvalues of $A$ are equal to each other, reduction to true diagonal form is not always possible. Suppose $\lambda_1$ is an eigenvalue of $A$ repeated $r_1$ times. Proceeding as before, we find an eigenvector $x_1$ so that

$Ax_1 = \lambda_1x_1$

Then if $x_1$ is the first column of a square matrix $X$, the first column of $AX$ will be $\lambda_1x_1$ and the first column of $X^{-1}AX$ will consist of $\lambda_1$ followed by (n − 1) zeros. Call this matrix $B$:

$B = X^{-1}AX = \begin{bmatrix} \lambda_1 & B_1 \\ 0 & B_{11} \end{bmatrix}$

Here $B_1$ is a row matrix with (n − 1) elements and $B_{11}$ is square of order (n − 1). Since $B$ is the transform of $A$, it also has the eigenvalue $\lambda_1$ repeated $r_1$ times, but $B_{11}$ contains that eigenvalue only ($r_1$ − 1) times. This matrix is subjected to the same procedure as $A$: we find an eigenvector and form a new matrix $Y$ whose first column is that eigenvector. Note however that $Y$ has only (n − 1) rows and columns and

$Y^{-1}B_{11}Y = \begin{bmatrix} \lambda_1 & C_1 \\ 0 & C_{11} \end{bmatrix}$

so that the matrix

$\begin{bmatrix} 1 & 0 \\ 0 & Y \end{bmatrix}$

will transform $B$ into the form

$\begin{bmatrix} \lambda_1 & \multicolumn{2}{c}{B_1Y} \\ 0 & \lambda_1 & C_1 \\ 0 & 0 & C_{11} \end{bmatrix}$

Continued applications of similar transformations will eventually result in a single matrix $Z$ such that

$Z^{-1}AZ = \begin{bmatrix} \Lambda_1 & F \\ 0 & G \end{bmatrix}$    (10-40)

where

$\Lambda_1 = \begin{bmatrix} \lambda_1 & \mu_{12} & \mu_{13} & \cdots & \mu_{1r_1} \\ 0 & \lambda_1 & \mu_{23} & \cdots & \mu_{2r_1} \\ 0 & 0 & \lambda_1 & \cdots & \mu_{3r_1} \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & \lambda_1 \end{bmatrix}$    (10-41)

The matrix $F$ is rectangular with $r_1$ rows and (n − $r_1$) columns while $G$ is square and of order (n − $r_1$). Now if $Z_1$ is a rectangular matrix composed from the first $r_1$ columns of $Z$, then we may remove the unwanted matrix $F$ from (40), for

$Z_1^{-1}AZ_1 = \Lambda_1$

where, $Z_1$ being rectangular, $Z_1^{-1}$ is to be understood as the matrix formed from the first $r_1$ rows of $Z^{-1}$.

The next step is to treat the matrix $G$ in a similar way until it is reduced to the form of $\Lambda_1$ with its eigenvalue $\lambda_2$ along the diagonal. We then continue with each remaining matrix until every eigenvalue has been used. Finally if we join together all of the rectangular matrices $Z_i$ to form a square matrix $W$, we will have

$W^{-1}AW = \mathrm{diag}\,(\Lambda_1, \Lambda_2, \cdots, \Lambda_r)$

where r is the number of distinct eigenvalues of $A$. Note that we are using the notation of sec. 10.7 to denote a diagonal matrix, but in this case the diagonal elements $\Lambda_i$ are really matrices themselves. Each is of the triangular form of (41).

In the general case, it is possible to make a further transformation so that the eigenvalues occur along the diagonal of $\Lambda_i$ while unity appears in each position immediately above the eigenvalue and zero elsewhere.¹⁷ In the special cases where $A$ is symmetric, Hermitian, or unitary, the non-diagonal elements of the triangular matrices may be completely removed so that the final form is truly diagonal. We consider these cases in secs. 10.17, 10.19, 10.20.


Problem. Reduce to diagonal form

$\begin{bmatrix} 8 & -8 & -2 \\ 4 & -3 & -2 \\ 3 & -4 & 1 \end{bmatrix}$.    Ans. $\lambda_1 = 1$; $\lambda_2 = 2$; $\lambda_3 = 3$.

10.16. Congruent Transformations.—When a change of variable, $x = Qy$, is applied to a quadratic form, (28b) becomes

$A(x,x) = \tilde{x}Ax = \tilde{y}\tilde{Q}AQy = \tilde{y}By$

Thus the transformation of the matrix $A$ which corresponds to this change of variable is congruent. Its importance is due to the fact that by its use a quadratic form may be reduced to a sum of squared terms, as will now be shown. Provided $B$ is diagonal, $\tilde{y}By$ will be a sum of squares. Hence our problem is that of diagonalizing the symmetric matrix $A$ by means of a congruent transformation. Suppose $A_{11}$ in $A$ is not equal to zero. Then $A$ may be written as

$A = \begin{bmatrix} A_{11} & V \\ \tilde{V} & A' \end{bmatrix}$

where $A'$ is the matrix obtained from $A$ by striking out the first row and first column and $V = [A_{12}\ A_{13}\ \cdots\ A_{1n}]$. Now let

$Q_1 = \begin{bmatrix} 1 & -V/A_{11} \\ 0 & E_{n-1} \end{bmatrix}$

where $E_{n-1}$ is the unit matrix of order (n − 1). Then

$\tilde{Q}_1AQ_1 = \begin{bmatrix} A_{11} & 0 \\ 0 & B \end{bmatrix}$

and $B$ is a matrix of order (n − 1) whose elements are

$B_{ij} = A_{i+1,j+1} - \dfrac{A_{i+1,1}A_{1,j+1}}{A_{11}}$

and for which the last row and column are designated (n − 1), not n. The matrix $B$ may be treated in the same way and the process continued until $A$ is completely reduced to diagonal form:

$\tilde{Q}AQ = \mathrm{diag}\,(\alpha_1, \alpha_2, \cdots, \alpha_n)$

17 The proof requires the theory of elementary divisors; see Turnbull and Aitken, loc. cit.

The final result is

$\tilde{x}Ax = \tilde{\xi}D\xi = \alpha_1\xi_1^2 + \alpha_2\xi_2^2 + \cdots + \alpha_n\xi_n^2$

with

$D = \tilde{Q}AQ = [\alpha_i\delta_{ij}]$

$Q = Q_1Q_2Q_3 \cdots Q_{n-1}$

and

$x = Q\xi$

The matrix $Q$ will have the form

$Q = \begin{bmatrix} 1 & -A_{12}/A_{11} & -A_{13}/A_{11} & \cdots & -A_{1n}/A_{11} \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & -B_{12}/B_{11} & \cdots & -B_{1,n-1}/B_{11} \\ 0 & 0 & 1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \times \cdots = \begin{bmatrix} 1 & Q_{12} & Q_{13} & \cdots & Q_{1n} \\ 0 & 1 & Q_{23} & \cdots & Q_{2n} \\ 0 & 0 & 1 & \cdots & Q_{3n} \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$

The determinants, $\Delta_m$, formed from $A$ by omitting all but the first m rows and columns are called the discriminants of the quadratic form. Moreover, as the reader may show (Problem b),

$\alpha_1 = \Delta_1$;  $\alpha_2 = \Delta_2/\Delta_1$;  $\alpha_3 = \Delta_3/\Delta_2$;  $\cdots$;  $\alpha_n = \Delta_n/\Delta_{n-1}$

If it is so desired, a further linear transformation $\eta_i = \sqrt{\alpha_i}\,\xi_i$ will reduce $\tilde{x}Ax$ to the form

$\tilde{\eta}\eta = \eta_1^2 + \eta_2^2 + \cdots + \eta_n^2$

Assuming that no element $A_{ii}$ is zero, we note that instead of starting with $A_{11}$ at the first step, as we have done here, any of the remaining $A_{ii}$, (n − 1) in number, might have been chosen. At the second step, there are (n − 2) choices available and so on. Thus the final forms of $Q$ and $D$ are not unique. When some of the $A_{ii}$ are zero or when some of the discriminants vanish, modifications¹⁸ are required in the method.
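The step-by-step reduction, and the relation between the diagonal elements and the discriminants, can be followed in a short computation. This is a sketch under stated assumptions: the routine `congruent_diagonalize`, the helper `det` and the sample symmetric matrix are introduced here for illustration, exact fractions are used, and the simple method is assumed to apply (no vanishing pivot is met).

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))


def congruent_diagonalize(a):
    """Return (Q, alphas) such that Q-transposed times A times Q is diag(alphas)."""
    n = len(a)
    m = [[Fraction(v) for v in row] for row in a]
    q = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for p in range(n - 1):
        pivot = m[p][p]
        if pivot == 0:
            raise ValueError("zero pivot: the simple method needs modification")
        factors = [m[p][j] / pivot for j in range(n)]   # taken before row p is altered
        for j in range(p + 1, n):
            f = factors[j]
            for i in range(n):          # column operation: right factor Q_p
                m[i][j] -= f * m[i][p]
                q[i][j] -= f * q[i][p]
            for i in range(n):          # row operation: left factor, Q_p transposed
                m[j][i] -= f * m[p][i]
    return q, [m[i][i] for i in range(n)]


# Illustrative symmetric matrix.
A = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
Q, alphas = congruent_diagonalize(A)

# Discriminants: leading principal minors of A.
deltas = [det([[Fraction(v) for v in row[:k]] for row in A[:k]]) for k in range(1, 4)]
ratios = [deltas[0]] + [deltas[k] / deltas[k - 1] for k in range(1, 3)]
print(alphas, ratios)
print(Q)
```

For this matrix the elements of `Q` above the diagonal can also be compared with the cofactor expressions quoted in Problem a below.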


18 These cases are discussed by Bôcher, loc. cit.

Problem a. Show that for n = 3, the elements of $Q$ may be taken as $Q_{12} = -A_{12}/A_{11}$; $Q_{13} = A^{13}/A^{33}$; $Q_{23} = A^{23}/A^{33}$.

Problem b. Verify the relation given in the preceding paragraph between the diagonal element $\alpha_i$ and the discriminant.

Problem c. Reduce the following expression to a sum of squares: $2x_1^2 + 7x_2^2 + 3x_3^2 + 4x_1x_2 + 8x_1x_3 - 2x_2x_3$. For one answer, $\alpha_1 = 2$; $\alpha_2 = 5$; $\alpha_3 = -10$.

10.17. Orthogonal Transformations.—In this section we limit our discussion to real orthogonal matrices, since we shall have no need for those containing complex elements. By definition, if $R$ is orthogonal, $\tilde{R} = R^{-1}$, hence $\tilde{R}R = R\tilde{R} = E$. One of their most important properties arises from the fact that transformation by them leaves the length of a vector unchanged. Suppose $x$ and $y$ are related by an orthogonal transformation

$x = Ry$;  $\tilde{x} = \tilde{y}\tilde{R}$

then

$\tilde{x}x = \tilde{y}\tilde{R}Ry = \tilde{y}y$

Our assertion is proved since $\tilde{x}x$ is the square of the length of $x$ and $\tilde{y}y$ is the square of the length of $y$. On expanding $R\tilde{R} = \tilde{R}R = E$ we find that

$\sum_{s=1}^{n} R_{ps}R_{qs} = \sum_{s=1}^{n} R_{sp}R_{sq} = \delta_{pq}$    (10-42)

These relations are the necessary and sufficient conditions that a matrix be orthogonal.

From the definition $\tilde{R}R = E$, we also see that $|\tilde{R}| \times |R| = |R|^2 = 1$, hence

$|R| = \pm 1$

Let us consider two matrices

$R^+ = \begin{bmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{bmatrix}$;  $R^- = \begin{bmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & -1 \end{bmatrix}$

which are easily shown to be orthogonal, the first having the determinant +1, the second −1. If we refer to eqs. (4-2) and (4-3), we see that they are both contained in eq. (42) and that the matrix $R^+$ represents a rotation of the coordinate system about OZ through the angle $\varphi$ in the positive direction. Similarly, $R^-$ is the matrix for the same rotation, followed by a reflection in the XY-plane. These two cases are called proper and improper rotations. If $\varphi = 0$ in the latter case, the operation is a simple reflection; if $\varphi = \pi$, it is called an inversion, the matrix $R^-$ becomes diagonal, with −1 for its elements, and the result is equivalent to a change in sign of the three components of a vector (x,y,z).
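The properties just stated for $R^+$ and $R^-$ are easy to confirm numerically. The following sketch uses names chosen here (`r_plus`, `r_minus`, `is_orthogonal`, `det3`), an arbitrary angle and an arbitrary vector; floating-point results are compared within a small tolerance.

```python
import math


def r_plus(phi):
    """Proper rotation of the axes about OZ through phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def r_minus(phi):
    """The same rotation followed by a reflection in the XY-plane."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, -1.0]]


def is_orthogonal(r, tol=1e-12):
    """Check both sums of eq. (10-42): over rows and over columns."""
    n = len(r)
    for p in range(n):
        for q in range(n):
            delta = 1.0 if p == q else 0.0
            rows = sum(r[p][s] * r[q][s] for s in range(n))
            cols = sum(r[s][p] * r[s][q] for s in range(n))
            if abs(rows - delta) > tol or abs(cols - delta) > tol:
                return False
    return True


def det3(m):
    """Determinant of a 3 x 3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def matvec(a, x):
    """Product of a matrix with a column vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


phi = 0.7                       # illustrative angle in radians
x = [1.0, -2.0, 0.5]            # illustrative vector
y = matvec(r_plus(phi), x)
len2_x = sum(c * c for c in x)
len2_y = sum(c * c for c in y)

inversion = r_minus(math.pi)    # should be diagonal with -1 for its elements
print(is_orthogonal(r_plus(phi)), is_orthogonal(r_minus(phi)))
print(det3(r_plus(phi)), det3(r_minus(phi)))
print(len2_x, len2_y, inversion)
```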

Matrices similar to $R^+$ are to be used in eqs. (10-20a) and (10-21), if we interpret the orthogonal transformation as a rotation of coordinate axes. However, if we prefer to rotate the vector $x$ and obtain a new vector $y$, as we also discussed in sec. 10.10, we must change the sign of the angle in $R^+$. But $\sin(-\varphi) = -\sin\varphi$, $\cos(-\varphi) = \cos\varphi$ and the following results are obtained since the matrices are orthogonal:

$R(-\varphi) = \tilde{R}(\varphi) = R^{-1}(\varphi)$

The matrix relations are thus

$y = R(-\varphi)x = \tilde{R}(\varphi)x = R^{-1}(\varphi)x$

and

$x = R(\varphi)y$

It follows that successive rotations of the same vector in one coordinate system to give new vectors $y$ and $z$ must be written in the form

$y = B(-\varphi)x$;  $z = A(-\theta)y$;  $z = A(-\theta)B(-\varphi)x$

or, if we prefer to retain the matrix elements shown in $R^+$, we must change the order of the matrix product to read

$x = B(\varphi)A(\theta)z$


The fact that an orthogonal transformation is both congruent and collineatory makes it useful for the following reason: It has been seen that the congruent transformation may be used to reduce a quadratic form $A(x,x)$ to a sum of squares, but the reduction is by no means unique. On the other hand, suppose the quadratic form has been reduced to a sum of squares by a congruent transformation and the elements of the transforming matrix are real. They can then be orthogonalized and normalized according to (42) and the resulting matrix $R$ is both congruent and collineatory (hence orthogonal). In symbols,

$x = Ry$;  $A(x,x) = \tilde{x}Ax = \tilde{y}R^{-1}ARy = \tilde{y}\Lambda y$

where $\Lambda$ is diagonal with the eigenvalues of $A$ for elements. It will be remembered that when a matrix is reduced to diagonal form by a similarity transformation, the eigenvectors which form the columns of the transforming matrix $X$ are not completely determined because the equations to be solved for the components of the eigenvectors are homogeneous. This arbitrariness now disappears, for we must fix the undetermined constants by means of (42) so that the transforming matrix is orthogonal and $\tilde{R}R = E$.

We are now in a position to prove a statement made in sec. 10.15, namely that if a matrix $A$ is symmetric and has multiple eigenvalues it may still be reduced to true diagonal form by an orthogonal transformation. Suppose $A$ undergoes a congruent transformation by the matrix $Q$. Then the new matrix $\tilde{Q}AQ$ is symmetric if $A$ is symmetric, for the transposed of $\tilde{Q}AQ$ is $\tilde{Q}\tilde{A}Q = \tilde{Q}AQ$. It thus follows that an orthogonal transformation will leave the symmetry of $A$ unchanged. This only can be true if the off-diagonal elements of the triangular matrices of (41) are zero, but then $\Lambda$ is diagonal.

Orthogonal transformations are often called principal azis transforma- 
tions since they are used in the problem of reducing a conic to principal 
axes and in finding the principal axes of a rotating body, or in reducing 
kinetic and potential energy expressions to sums of squared terms. The 
eigenvectors are frequently called normal coordinates in these cases.!? 

A similar procedure serves to reduce²⁰ simultaneously two quadratic
forms to a sum of squares. Suppose the two forms are $A(x,x) = \tilde{x}Ax$
and $B(x,x) = \tilde{x}Bx$. First reduce $A(x,x)$ to a sum of squares by a
congruent transformation, $x = Qy$, which will give

$\tilde{x}Ax = \tilde{y}\tilde{Q}AQy = \tilde{y}Dy = \tilde{y}[a_i\delta_{ij}]y$

The same transformation applied to B will give

$\tilde{x}Bx = \tilde{y}\tilde{Q}BQy = \tilde{y}Cy$

but C is not diagonal. Now make the substitution $\eta_i = \sqrt{a_i}\,y_i$, which
results in

$\tilde{y}Dy = \tilde{\eta}\eta; \qquad \tilde{y}Cy = \tilde{\eta}C'\eta$

where the $a_i$ have been absorbed into C to give C′. Finally, an orthogonal
transformation, $\eta = R\xi$, will reduce C′ to diagonal form, yielding

$\tilde{y}Dy = \tilde{\xi}\tilde{R}R\xi = \tilde{\xi}\xi = \xi_1^2 + \xi_2^2 + \cdots + \xi_n^2$
$\tilde{y}Cy = \tilde{\xi}\tilde{R}C'R\xi = \tilde{\xi}\Lambda\xi = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_n\xi_n^2$

Even when the two quadratic forms are not functions of the same
variables, the transformation may often be made. For example, in the
mechanical problem of small oscillations where it is required to find normal
coordinates for the kinetic and potential energies, the two quadratic forms
appear as $T = \tilde{v}Av$ and $V = \tilde{x}Bx$ where $v = dx/dt$, T being positive definite.
The reduction causes no difficulty since the cogredient variables x and v
both undergo the same transformation.²¹
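The three-step reduction just described (diagonalize A, rescale, then rotate) can be tried numerically. The following is only a minimal sketch for the 2 × 2 case, assuming A is positive definite; the helper names (`congruent`, `rotation_diagonalizing`, `reduce_pair`) and the sample matrices are illustrative choices, not taken from the text.

```python
import math

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def congruent(T, M):
    """Congruent transformation T~ M T."""
    return matmul(transpose(T), matmul(M, T))

def rotation_diagonalizing(M):
    """2x2 rotation R such that R~ M R is diagonal (M real symmetric)."""
    theta = 0.5 * math.atan2(2.0 * M[0][1], M[0][0] - M[1][1])
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def reduce_pair(A, B):
    """Return T with T~ A T = E and T~ B T diagonal (A positive definite)."""
    Q = rotation_diagonalizing(A)          # x = Q y, gives D = diag(a_1, a_2)
    D = congruent(Q, A)
    scale = [[1.0 / math.sqrt(D[0][0]), 0.0],
             [0.0, 1.0 / math.sqrt(D[1][1])]]   # y_i = eta_i / sqrt(a_i)
    QS = matmul(Q, scale)
    C_prime = congruent(QS, B)             # the matrix called C' in the text
    R = rotation_diagonalizing(C_prime)    # eta = R xi
    return matmul(QS, R)

# Hypothetical sample forms: A positive definite, B merely symmetric.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[1.0, 2.0], [2.0, -1.0]]
T = reduce_pair(A, B)
A_reduced = congruent(T, A)   # should be the unit matrix
B_reduced = congruent(T, B)   # should be diagonal
```

The combined transformation is x = QSRξ, where S is the diagonal rescaling; in general it is not orthogonal, which is why the positive definiteness of one form is needed.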
We show in eq. (53) that for a unitary matrix, $\lambda_i\lambda_i^* = 1$ for every i.
Since a real orthogonal matrix is also unitary, it follows that the only possible
eigenvalues for a real orthogonal matrix are ±1 or $e^{\pm i\phi}$. In the latter
case, φ must be real and the exponentials occur in pairs with opposite
signs. If k is the number of eigenvalues equal to −1, the determinant of
the matrix equals $(-1)^k$.

19 See Chapter 9; for a fuller discussion, see Whittaker, E. T., “A Treatise on the
Analytical Dynamics of Particles and Rigid Bodies,” Third Edition, Cambridge Press,
1927.

20 The reduction is not always possible but it can be made if one of the forms is
positive definite, as is the case in most physical problems.

21 An example of this case was discussed in sec. 9.11; see also Whittaker, loc. cit.,
Chapter VII.

In the previous discussion of this section, we have shown that when the
eigenvalues are real it is possible to reduce matrices to diagonal form by
means of an orthogonal transformation. Now suppose that R, the matrix
to be reduced, is itself orthogonal and of order n, and that the n eigenvalues
are +1 occurring $j_1$ times, −1 occurring $j_2$ times ($j_1 + j_2 = j \le n$) and
$e^{\pm i\phi_k}$. Since the latter must appear in pairs there are an even number of
them, i.e., $n - j = 2m$ and $k = 1, 2, \ldots, m$. Some simplification of the
final diagonal matrix may be made by noting that if $\phi_k = 0$ or $\pi$, $e^{\pm i\phi_k}$
equals ±1. Thus we may write an even number of the eigenvalues ±1 as
exponentials and obtain the following special cases for Λ, the diagonalized
form of R.

n even, n = 2k:
$|R| = +1$: $\Lambda = \mathrm{diag}(e^{i\phi_1}, e^{-i\phi_1}, \ldots, e^{i\phi_k}, e^{-i\phi_k})$
$|R| = -1$: $\Lambda = \mathrm{diag}(1, -1, e^{i\phi_1}, e^{-i\phi_1}, \ldots, e^{i\phi_{k-1}}, e^{-i\phi_{k-1}})$

n odd, n = 2k + 1:
$|R| = +1$: $\Lambda = \mathrm{diag}(1, e^{i\phi_1}, e^{-i\phi_1}, \ldots, e^{i\phi_k}, e^{-i\phi_k})$
$|R| = -1$: $\Lambda = \mathrm{diag}(-1, e^{i\phi_1}, e^{-i\phi_1}, \ldots, e^{i\phi_k}, e^{-i\phi_k})$
(10-43)
Now consider the form of the matrix X which diagonalizes R:

$X^{-1}RX = \Lambda$

Its r-th column is an eigenvector $x_r$ of R and its eigenvalue will be assumed
to be $e^{i\phi_r}$. But according to (36)

$Rx_r = e^{i\phi_r}x_r$ (10-44)

hence $x_r$ will in general be complex and of the form

$x_r = x_r' + ix_r''$

where $x_r'$ and $x_r''$ are real. In a similar manner, it follows that the eigenvector
and column of X corresponding to $e^{-i\phi_r}$ is $x_r' - ix_r''$. We conclude
that usually the transforming matrix X will have some complex elements.
Recalling the fact that in the previous case, where X was orthogonal, the
transformation was both collineatory and congruent we see that the necessary
modification here is that the transformation be collineatory and conjunctive.
The transforming matrix, then, is unitary, hence we can only
diagonalize an orthogonal matrix by means of a unitary matrix.
Let us see what would happen if we transform a real orthogonal matrix
R by another real orthogonal matrix S.

$\tilde{S}RS = Z$

We write (44) as

$R(x_r' + ix_r'') = e^{i\phi_r}(x_r' + ix_r'')$ (10-45)

for the particular eigenvalue $e^{i\phi_r}$. Since $e^{i\phi_r} = \cos\phi_r + i\sin\phi_r$ we may
equate the real and imaginary parts of (45) to get

$Rx_r' = x_r'\cos\phi_r - x_r''\sin\phi_r$
$Rx_r'' = x_r'\sin\phi_r + x_r''\cos\phi_r$

with a similar expression for the column of X that comes from $e^{-i\phi_r}$. If
we replace the complex eigenvector $x_r = x_r' + ix_r''$ by $x_r'$ and $x_r' - ix_r''$ by
$x_r''$, the resulting matrix contains only real elements and may be made
orthogonal by requiring that (42) be fulfilled. Let us call this matrix S.
Transformation by it will give the following forms for Z:

n even, n = 2k:
$|R| = +1$: $Z = \mathrm{diag}(C_1, C_2, \ldots, C_k)$
$|R| = -1$: $Z = \mathrm{diag}(1, -1, C_1, C_2, \ldots, C_{k-1})$

n odd, n = 2k + 1:
$|R| = +1$: $Z = \mathrm{diag}(1, C_1, C_2, \ldots, C_k)$
$|R| = -1$: $Z = \mathrm{diag}(-1, C_1, C_2, \ldots, C_k)$

where

$C_k = \begin{pmatrix} \cos\phi_k & \sin\phi_k \\ -\sin\phi_k & \cos\phi_k \end{pmatrix}$

It is worth while to point out that the only other possible two-dimensional
real orthogonal matrix is of the type

$\begin{pmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{pmatrix}$

Its eigenvalues are ±1, hence such matrices cannot occur in the reduced
form of R as we have already included all real eigenvalues in the preceding
expressions for Z.
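The two kinds of two-dimensional real orthogonal matrix can be compared numerically. This is an illustrative sketch only: the eigenvalues are taken from the trace and determinant of a 2 × 2 matrix, and the angle 0.7 and the names `C`, `F`, and `eigenvalues_2x2` are arbitrary choices made here.

```python
import cmath
import math

def eigenvalues_2x2(M):
    """Roots of the characteristic equation lambda^2 - tr*lambda + det = 0."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + root) / 2.0, (tr - root) / 2.0)

phi = 0.7  # arbitrary sample angle

# Rotation-type block, as in the reduced form Z.
C = [[math.cos(phi), math.sin(phi)],
     [-math.sin(phi), math.cos(phi)]]

# The "other" type of real orthogonal 2x2 matrix (determinant -1).
F = [[math.cos(phi), math.sin(phi)],
     [math.sin(phi), -math.cos(phi)]]

rot_eigs = eigenvalues_2x2(C)    # expected: exp(+i phi), exp(-i phi)
refl_eigs = eigenvalues_2x2(F)   # expected: +1, -1
```

For the first matrix the eigenvalues form a conjugate pair on the unit circle; for the second they are real, which is consistent with the remark that such a block adds nothing new to Z.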


Problem. Prove that R7™ and R` are orthogonal matrices. Reduce each to 
diagonal form. 


10.18. Hermitian Vector Space.—Since many of the matrices occurring
in physical problems²² contain complex elements, it is necessary to
amplify the vector concept presented in sec. 10.8. We write in place of
(11), the Hermitian scalar product

$x^\dagger y = x_1^*y_1 + x_2^*y_2 + \cdots + x_n^*y_n$ (10-46)

The square of the absolute length of a vector is then real,

$x^\dagger x = x_1^*x_1 + x_2^*x_2 + \cdots + x_n^*x_n$

If $x^\dagger y = y^\dagger x = 0$, the two vectors are orthogonal or mutually perpendicular.
If $x^\dagger x = 1$, the vector is a unit vector or normalized. For a scalar a,

$x^\dagger(ay) = a\,x^\dagger y; \qquad (ax)^\dagger y = a^*x^\dagger y$

The Hermitian scalar product is distributive

$x^\dagger(y + z) = x^\dagger y + x^\dagger z$

If A is any matrix,

$x^\dagger Ay = (A^\dagger x)^\dagger y; \qquad (Ax)^\dagger y = x^\dagger(A^\dagger y)$ (10-47)

22 For the use of matrix theory in quantum mechanics, see Chapter 11. For further
discussion, see Born, M., and Jordan, P., “Elementare Quantenmechanik,” J. Springer,
Berlin, 1930; Wigner, E., “Gruppentheorie und ihre Anwendung auf die Quantenmechanik
der Atomspektren,” Vieweg, Braunschweig, 1931.
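The rules for the Hermitian scalar product are easy to check on small complex vectors. The sketch below is illustrative only; `hdot`, `matvec`, `adjoint`, and the sample vectors and matrix are names and values assumed here, not taken from the text. Each quantity other than `length_squared` is the difference between the two sides of one rule and should vanish.

```python
def hdot(x, y):
    """Hermitian scalar product: sum of conj(x_i) * y_i."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def matvec(A, x):
    """Matrix times column vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(A):
    """Hermitian conjugate (conjugate transpose) of a matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

# Arbitrary sample data.
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
z = [0.5j, -1 + 0j]
a = 2 + 3j
A = [[1 + 1j, 2 + 0j], [0j, 3 - 2j]]

length_squared = hdot(x, x)   # real: |x_1|^2 + |x_2|^2

# Left side minus right side of each rule.
scalar_right = hdot(x, [a * yi for yi in y]) - a * hdot(x, y)
scalar_left = hdot([a * xi for xi in x], y) - a.conjugate() * hdot(x, y)
distributive = (hdot(x, [yi + zi for yi, zi in zip(y, z)])
                - (hdot(x, y) + hdot(x, z)))
adjoint_rule = hdot(x, matvec(A, y)) - hdot(matvec(adjoint(A), x), y)
```

Note that a scalar taken out of the left-hand vector appears as its complex conjugate, which is the only point where the Hermitian product differs in form from the real scalar product.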


10.19. Hermitian Matrices.—If the variables in the bilinear form (28) 
are complex conjugate to each other and if its matrix is Hermitian, the 
form is called Hermitian. Thus, 


$H(x,x) = \sum_{i,j} H_{ij}x_i^*x_j = x^\dagger Hx; \qquad H_{ij} = H_{ji}^*$ (10-48)


In spite of the fact that the elements of (48) are complex, the form itself 
is real, 

The eigenvalues of an Hermitian matrix are also all real. Suppose $\lambda_i$
is an eigenvalue corresponding to an eigenvector $x_i$, then

$Hx_i = \lambda_ix_i; \qquad x_i^\dagger Hx_i = \lambda_ix_i^\dagger x_i$

Since $x_i^\dagger Hx_i$ and $x_i^\dagger x_i$ are both real, it follows that $\lambda_i$ is real.

An Hermitian matrix H remains Hermitian when transformed by
either an orthogonal or a unitary matrix. To prove this statement for a
real orthogonal matrix R, suppose $H_1$ is known to be Hermitian and
$R^{-1}H_1R = H_2$. Then, since $R^{-1} = \tilde{R}$, we have $H_2 = R^{-1}H_1R = \tilde{R}H_1R$
and $H_2^\dagger = R^\dagger H_1^\dagger\tilde{R}^\dagger$. But $H_1^\dagger = H_1$ and R is assumed to be real, so $R^\dagger = \tilde{R}$,
$\tilde{R}^\dagger = R$. Thus $H_2^\dagger = \tilde{R}H_1R = R^{-1}H_1R = H_2$. The proof for a unitary
matrix is similar.

As we have previously stated, a real symmetric matrix is a special case 
of the Hermitian matrix. Thus, except for slight modifications, the reduc- 
tion of Hermitian matrices to diagonal form is similar to the procedure 
used for real matrices. For example, an Hermitian form may be con- 
verted to a sum of squares in many ways by a conjunctive transformation. 
On the other hand, we saw in sec. 10.15 that a matrix could be converted 
to diagonal form, with its eigenvalues on the diagonal, by means of a collineatory
transformation. If the matrix is Hermitian, we may require
that the transformation be both collineatory and conjunctive, hence uni- 
tary, and the diagonal form is then unique. The same argument which 
was used for a real symmetric matrix shows us that even if the eigenvalues 
are not all different, the true diagonal form may be obtained since trans- 
formation by a unitary matrix leaves the symmetry of an Hermitian matrix 
unchanged. 

The necessary condition that two Hermitian forms be simultaneously
reducible to a sum of squares is that they commute.²³ Suppose that
$H(x,x) = x^\dagger Hx$ and $K(x,x) = x^\dagger Kx$ are given and that both H and K are
Hermitian or unitary.²⁴ Let S be a unitary matrix that reduces H and K
simultaneously to diagonal forms, H′ and K′:

$H' = S^{-1}HS; \qquad K' = S^{-1}KS$

Clearly H′ and K′ commute since they are both diagonal, hence we may
write

$H'K' = S^{-1}HSS^{-1}KS = S^{-1}HKS$
$K'H' = S^{-1}KSS^{-1}HS = S^{-1}KHS$

or

$S^{-1}HKS = S^{-1}KHS$

since K′H′ = H′K′. It thus follows that HK = KH.
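The argument above can be illustrated with a small numerical case. In this sketch two Hermitian matrices are built from the same unitary matrix S and different real diagonal forms, so they are simultaneously reducible by construction; the check is that they then commute. The particular S and the diagonal entries are arbitrary choices made for illustration.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j])
               for i in range(len(X)) for j in range(len(X[0])))

r = 1.0 / math.sqrt(2.0)
S = [[r, r], [1j * r, -1j * r]]   # a sample unitary matrix

def from_diagonal(d1, d2):
    """Build S diag(d1, d2) S^dagger, Hermitian when d1, d2 are real."""
    return matmul(S, matmul([[d1, 0.0], [0.0, d2]], adjoint(S)))

H = from_diagonal(1.0, 4.0)
K = from_diagonal(-2.0, 5.0)

commutator_size = max_abs_diff(matmul(H, K), matmul(K, H))  # should vanish
H_prime = matmul(adjoint(S), matmul(H, S))   # should be diag(1, 4)
K_prime = matmul(adjoint(S), matmul(K, S))   # should be diag(-2, 5)
```

This only exercises the necessity argument given in the text; it says nothing about sufficiency, for which the text cites Weyl.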


Problem. Prove that an Hermitian matrix remains Hermitian after transforma- 
tion by a unitary matrix. 


10.20. Unitary Matrices.—If we indicate a unitary matrix by U, then 
from its definition 
$U = (U^\dagger)^{-1}$

hence

$U^\dagger = U^{-1}; \qquad UU^\dagger = U^\dagger U = E$ (10-49)

Suppose the elements in a single column of U are given by $U_i$, then the
Hermitian scalar product of two columns

$U_i^\dagger U_j = \delta_{ij}$

A similar relation may be found between the rows. Hence the rows and
columns of a unitary matrix of order n form a set of n mutually perpendicular
unit vectors in Hermitian space. This may be seen at once by writing
(49) explicitly:

$\sum_s U_{sp}^*U_{sq} = \sum_s U_{ps}U_{qs}^* = \delta_{pq}$ (10-50)

These equations are analogous to (42) for orthogonal matrices.

23 The sufficiency of this condition is proved by Weyl, H., “The Theory of Groups
and Quantum Mechanics,” Methuen, London, 1931.

24 If these matrices were not Hermitian or unitary, neither of them could be reduced
to diagonal form (unless all eigenvalues were distinct, a case which is not very
interesting).
If x and y are any two vectors,

$(Ux)^\dagger Uy = x^\dagger(U^\dagger Uy) = x^\dagger y$

hence a transformation by a unitary matrix leaves a bilinear or quadratic
form invariant. In particular, if x = Uy, then

$x^\dagger x = (Uy)^\dagger Uy = y^\dagger y$

This is the analogue of the fact that an orthogonal matrix in real vector
space leaves the length of a vector unchanged. In fact, the unitary matrix
in Hermitian vector space is the generalization of the orthogonal matrix
for real vector space.

The product of two unitary matrices U and V is also unitary:

$(UV)^\dagger = V^\dagger U^\dagger = V^{-1}U^{-1} = (UV)^{-1}$ (10-51)

The reciprocal of a unitary matrix is unitary:

$(U^{-1})^\dagger = (U^\dagger)^\dagger = U = (U^{-1})^{-1}$ (10-52)

The eigenvalues of a unitary matrix may be real or complex but of
absolute value 1. Suppose $\lambda_i$ is an eigenvalue of U, then

$Ux_i = \lambda_ix_i; \qquad (Ux_i)^\dagger Ux_i = x_i^\dagger x_i = \lambda_i^*\lambda_ix_i^\dagger x_i$ (10-53)

Since $x_i^\dagger x_i$ is real and does not vanish, $\lambda_i^*\lambda_i = 1$.

A unitary matrix may be transformed into diagonal form by another
unitary matrix V, the diagonal elements being the eigenvalues of U. The
procedure is similar to that for similarity and orthogonal transformations.
The eigenvectors must be normalized to satisfy $V^\dagger V = E$. The result is

$V^{-1}UV = V^\dagger UV = \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$
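The properties of unitary matrices collected in this section can be checked on 2 × 2 examples. The sketch below is illustrative; the matrices `U` and `V`, the test vector, and the helper names are assumptions made here. Eigenvalues are again obtained from the trace and determinant.

```python
import cmath
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def is_unitary(U, tol=1e-12):
    """Check U^dagger U = E element by element."""
    P = matmul(adjoint(U), U)
    n = len(U)
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

def eigenvalues_2x2(M):
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + root) / 2.0, (tr - root) / 2.0)

r = 1.0 / math.sqrt(2.0)
U = [[r, r], [1j * r, -1j * r]]
phase = cmath.exp(0.4j)
V = [[phase * math.cos(0.9), phase * math.sin(0.9)],
     [-phase * math.sin(0.9), phase * math.cos(0.9)]]

UV = matmul(U, V)                                  # product of two unitary matrices
moduli = [abs(lam) for lam in eigenvalues_2x2(U)]  # each should equal 1

# Length of a vector before and after transformation by U.
x = [1 + 2j, -0.5j]
Ux = [U[0][0] * x[0] + U[0][1] * x[1], U[1][0] * x[0] + U[1][1] * x[1]]
norm_before = sum(abs(c) ** 2 for c in x)
norm_after = sum(abs(c) ** 2 for c in Ux)
```

The checks correspond to (10-49), (10-51), (10-53), and the invariance of $x^\dagger x$, in that order of appearance in the code.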


10.21. Summary on Diagonalization of Matrices.—The matter of 
diagonalizing matrices is so useful in practice that a final and simple 
statement regarding conditions for the feasibility of this reduction seems 
in order. 

A matrix may be diagonalized (a) if all its eigenvalues are distinct (for
procedure, see sec. 10.15a), (b) if it is Hermitian or symmetric (see secs.
10.16 and 10.19), (c) if it is unitary (see sec. 10.20). In cases (b) and (c)
a unitary matrix can always be found to effect the transformation while in
(a) a more general type of transforming matrix will be needed.

CHAPTER 11
QUANTUM MECHANICS 


11.1. In conformity with the scope of this book, the emphasis of the 
present chapter is on the mathematics of quantum mechanics, the physical 
ideas entering the discussion only in a secondary way. Limitation of space 
further demands that only the important, and this happily implies the more
elementary, portions of the wide field be presented. Complete exclusion 
of physical ideas would, however, leave its subject matter so poorly joined 
and so incomprehensible to the student who has no prior knowledge of 
quantum mechanics that the value of an entirely formal treatment appears 
questionable. It is also true that no part of applied mathematics exacts 
from its student a more radical change from his customary habits of
thought, a greater tolerance for new methods of inquiry, than does this 
latest branch. In order to provide the proper attitude of mind, we preface 
the later mathematical developments by a few qualitative remarks whose 
relevance to the present book is but auxiliary. 

The central notion of classical mechanics is the mass point, or particle. 
Classical theory therefore presupposes, tacitly, that a physical system can 
in principle be recognized as a particle, or a set of particles. Until the 
advent of quantum physics this dogma has never been questioned; in 
fact scientific philosophers have frequently inflated it to the dimensions of 
a universal proposition claiming that all physical systems are composed of 
particles. The method of physical description in best accord with this 
fundamental attitude is clearly this: To correlate instantaneous positions 
of a given particle with instants of time, assuming motion to be continuous 
in space and time. Thus, if a particle moves along the X-axis, the com- 
plete description of its motion would appear in the form z = f(t). 

Now it is conceivable that such a correlation becomes impossible, and 
the question then arises whether this fundamental mode of description 
should be abandoned in such circumstances. The answer which has often 
been given and which the modern physicist emphatically rejects is the flatly 
negative one, the answer alleging that classical description is intrinsically 
evident and that the relation x = f(t) has meaning even when the func- 
tional relation cannot be established. On the other hand, one would not 
like to discard this successful description lightly, for instance because of 
certain practical and accidental difficulties in the procedure of measuring x 

© QaQ 
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as a function of ¢. The criterion which has ultimately produced clarity is 
this: A method of description must be abandoned when it becomes impossi-~ 
ble, not because of experimental difficulty, but because its use contradicts 
known laws of science. Classical description has become impossible for the 
latter reason, as the following simple example will show. 

Imagine an oscillating mass point, e.g., the bob of a pendulum. As 
long as the eye can follow the bob, correlations between x and t can certainly
be made. But suppose the mass point is made to increase its frequency
of vibration. The eye will soon be unable to perceive instantaneous
positions, but the camera can still establish them. When the camera
fails, oscillographic methods may be available, and after that, ingenious 
devices perhaps not yet invented may serve. But ultimately, a barrier 
of an essential kind will be encountered. Let us assume that the bob 
oscillates $10^{10}$ times per second. It is a fact of atomic physics that visible
light requires about $10^{-8}$ seconds to be emitted (or reflected). Thus if it
were used as the medium of report, the light-emitting mass would have to 
remain in a given position for approximately that length of time. In the 
present instance, however, the bob executes 100 vibrations within this 
period. A similar argument can finally be used to invalidate every other 
means for establishing the classical correspondence. The latter has to be 
ultimately abandoned because its use contradicts the laws of optics. 

What, then, can be done? Perhaps the example suggests an answer. 
While a snapshot can in principle no longer be taken of the rapidly oscillat- 
ing bob, a time exposure would reveal some features of its dynamical 
behavior. It would give essentially a correlation between the time the 
bob spends within a given interval dx and the location of that interval,
in other words between x and the probability w dx of encountering it in
dx. This leads to a less pretentious description of the physical system
called a mass point, of the form w = p(x), and this description is charac- 
teristic of quantum mechanics. It is to be noted that p(x) can be inferred 
from the classical relation x = f(t), but not f(t) from w = p(x). 
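The "time exposure" can be imitated numerically. The sketch below assumes, purely for illustration, that the bob executes simple harmonic motion x = A sin(2πt/T); it samples the motion at equally spaced instants over one period and counts the fraction of time spent in a given interval. For this assumed motion the exact fraction is (arcsin(x₂/A) − arcsin(x₁/A))/π. The function names and sample count are choices made here.

```python
import math

def time_fraction(x_lo, x_hi, amplitude=1.0, samples=100_000):
    """Fraction of one period during which x_lo <= x(t) < x_hi,
    for the assumed motion x = amplitude * sin(phase)."""
    inside = 0
    for k in range(samples):
        phase = 2.0 * math.pi * (k + 0.5) / samples
        x = amplitude * math.sin(phase)
        if x_lo <= x < x_hi:
            inside += 1
    return inside / samples

def analytic_fraction(x_lo, x_hi, amplitude=1.0):
    """Integral of w dx with w = 1 / (pi * sqrt(A^2 - x^2))."""
    return (math.asin(x_hi / amplitude) - math.asin(x_lo / amplitude)) / math.pi

near_centre = time_fraction(0.0, 0.1)   # bob moves fast here
near_turning = time_fraction(0.9, 1.0)  # bob lingers near the turning point
```

The density is obtained from x = f(t), but the same density would result from the motion run backwards or shifted in time, which illustrates why f(t) cannot be recovered from it.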

Quantum mechanics provides the means for deducing probability rela- 
tions of the type described, and it does so in a logically consistent fashion. 
But before turning to this central issue, let us see what has become of the 
concept: particle. Our time exposure has left it very ill defined. Indeed 
if the system called a mass point were invisibly small or never sufficiently 
stationary to permit the classical description, the customary properties of 
particles would never be exhibited. By the criterion of essential observa- 
bility, the concept would lose its physical significance. From a misunder- 
standing of this situation there has arisen a claim that quantum mechanics 
leads to a dualism, to the monstrous conception that ultimate entities of 
physics like electrons are both particles and waves: the correct statement
is that they are neither particles nor waves, but more abstract entities for
the description of which quantum mechanics gives most simple and success- 
ful rules. The question as to the particle or wave nature of an electron 
must be put in the same class as that regarding its color — or, to use a lighter 
metaphor due to the philosopher Dingle, as the question concerning the 
color of an elephant’s egg if an elephant laid eggs. 

Despite this fundamental situation we shall place no ban upon the use 
of the terms particle, wave, etc.; we shall even adhere to universal practice 
in calling the electron one of the elementary particles of nature; we do this 
only, of course, as a concession to usage. But whenever a paradox arises, 
the reader should endeavor to resolve it by recalling that the “classical
language” when applied to atomic entities is in fact metaphoric.


AXIOMATIC FOUNDATION 


11.2. Definitions.—For the sake of brevity all historical considerations 
are omitted here. Nor will any attempt be made to “deduce” quantum
mechanics either from classical physics or from outstanding experimental 
facts, for in a strict logical sense this cannot be done. We shall, however, 
present the framework of the theory with utmost economy of thought and 
space, committing the reader to the tacit understanding that all experi- 
mental consequences of the theory outlined have been verified as far as 
they could hitherto be tested. 

On a physical system, by which is meant any object of interest to physics 
or chemistry, numerous observations or measurements can be made. The 
quantities so observed or measured, such as size, energy, position and 
momentum, are called observables. It is well to think of these observables 
without ascribing to them the intuitive qualities they possess in classical 
mechanics. Position, or energy, is not so much possessed by a system as it 
is characteristic of a certain measuring process which can be carried out upon 
it. The measurement of an observable upon a system yields a number. 

In defining the state of a physical system considerable caution must be 
exercised, for we wish to remain in keeping with the requirements outlined 
in the introductory paragraphs. First it is well to notice that by state the 
scientist never means anything not subject to arbitrary fixation; indeed 
the definition of state is made to conform to the needs of each particular 
subject. It is quite different, for instance, in classical mechanics from 
what it is in thermodynamics or in electrodynamics. Hence we need not 
feel ill at ease when in quantum mechanics a new choice is made. Leaving 
elucidation until later: a state is¹ a function of certain variables, a function
from which by the rules of quantum theory significant information can be
obtained. The variables may be chosen in several ways, each giving rise to
a consistent description equivalent to all others; here they will be taken to
be space coordinates, for this gives rise to the form of quantum mechanics
most commonly used, namely Schrödinger’s. By state, or state function,
we thus mean a mathematical construct, $\phi(x_1,y_1,z_1; x_2,y_2,z_2; \ldots; x_n,y_n,z_n)$.
It is possible, as we shall later see, to associate the variables $x_1 \cdots z_n$ with
the dimensions of configuration space of the classical analogue of the system
in question. In particular, the number of variables needed in $\phi$ for a
complete description of its behavior (at a given instant of time) has always
been found to be equal to the number of its classical degrees of freedom.
This must indeed be the case in order that large scale bodies be consistently
described both by quantum mechanics and by classical mechanics. States
may change with time; hence a state in its widest meaning may be written

$\phi(x_1, y_1, \ldots, z_n, t)$

1 The reader who dislikes this phrase may substitute “is represented by” for the
simple “is.” We wish to warn, however, that the spirit of quantum mechanics permits
no distinction in meaning between these two expressions.


Certain restrictions are to be placed upon state functions, restrictions
which will take on greater plausibility in view of the postulates of the next
section. Most important among them are two: first $\phi$, which may be a
complex function, must possess an integrable square² in the sense that

$\int \phi^*\phi\,d\tau < \infty$ (11-1)

where $d\tau$ is the “volume of configuration space,” i.e., in rectangular
coordinates

$d\tau = dx_1\,dy_1\,dz_1 \cdots dx_n\,dy_n\,dz_n$

Second,

$\phi$ is single-valued (11-2)

The function $\phi$ may of course be expressed in any other system of space
coordinates by the ordinary geometric transformations of Chapter 5.

Condition (2) is particularly important when one of the variables is an
angle, say $\alpha$, for it then requires that

$\phi(\alpha) = \phi(\alpha + 2n\pi)$ (11-3)

n being an integer.

Finally we must include in our list of definitions another mathematical
construct, that of an operator. Every specific mathematical operation,
like adding 6, or multiplying by c, or extracting the third root, etc., can be
represented by a characteristic symbol which is then called an operator.
Operators are: $6+$, $c\,\cdot$, $\sqrt[3]{\ }$, $\dfrac{d}{dx}$, $\int dx$, $A\dfrac{d^2}{dx^2} + B\dfrac{d}{dx} + C$, and so forth.

2 This statement requires modification in some cases. See remarks concerning
“continuous spectrum,” sec. 11.9c. Condition (1) must be rigorously maintained
without exception when $\int d\tau$ is finite. It seems best to present the foundations of the
theory with this restriction, leaving necessary generalizations for later.


In general they act on functions. They can be applied in succession.
When they are so applied, the order in which the operators occur is important.
For convenience, let us use more general symbols for operators, such
as P and Q. If P stands for $a+$ and Q for $c\,\cdot$, then PQf means $a + cf$ where
f is a function; however QPf means $c(a + f)$. Thus

$QPf = PQf + (c - 1)a$ (11-4)

Such an equation is said to be an operator equation. The reader will at
once verify that, if P stands for $\partial/\partial x$ and Q for $x\,\cdot$, the operator equation

$PQf - QPf = f$ (11-5)

holds.

There is an important difference between eqs. (4) and (5); the second 
is homogeneous in f, the first is not. From the second, f may be canceled 
symbolically so that it reads 

$PQ - QP = 1$ (11-5)
Only homogeneous operator equations of this kind, usually written in the 
latter form without explicit insertion of the operand f, are of interest in 
quantum mechanics. 
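Equation (5) can be verified mechanically on polynomials, for which differentiation and multiplication by x are simple operations on coefficient lists. This is only an illustrative sketch; the list representation (coefficient of $x^k$ at index k) and the function names are choices made here.

```python
def deriv(p):
    """P = d/dx acting on a polynomial given as [c0, c1, c2, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def times_x(p):
    """Q = multiplication by x."""
    return [0] + list(p)

def subtract(p, q):
    """Coefficient-wise difference, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

f = [5, -3, 0, 2]                      # f(x) = 5 - 3x + 2x^3, a sample operand

PQf = deriv(times_x(f))                # d/dx (x f)
QPf = times_x(deriv(f))                # x df/dx
commutator_on_f = subtract(PQf, QPf)   # should reproduce f itself
```

Because the result equals f for any polynomial operand, the operand may be dropped, which is the sense in which the homogeneous equation is written without f.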

The formalism of operators is convenient also in other ways. It is 
possible, for instance, to define a periodic function $\phi(x)$ by writing

$e^{hD}\phi(x) = \phi(x)$

D being d/dx; for the left-hand side is, on expansion, simply the Taylor
series for $\phi(x + h)$.
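That the exponential of hD acts as a displacement by h can be checked exactly on a polynomial, where the Taylor series terminates. The sketch below sums $h^kD^k/k!$ applied to a sample polynomial and compares the result with the polynomial evaluated at x + h; exact rational arithmetic is used so that no rounding enters. The names and sample values are assumptions made here, and a polynomial is of course not periodic, so only the displacement property is illustrated.

```python
from fractions import Fraction
from math import factorial

def deriv(p):
    """d/dx on a coefficient list [c0, c1, c2, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def evaluate(p, x):
    """Horner evaluation of the polynomial at x."""
    total = 0
    for c in reversed(p):
        total = total * x + c
    return total

def exp_hD(p, h, x):
    """Sum over k of h^k / k! * (D^k p)(x); finite for a polynomial."""
    total = Fraction(0)
    term = list(p)
    k = 0
    while any(term):
        total += Fraction(h) ** k / factorial(k) * evaluate(term, x)
        term = deriv(term)
        k += 1
    return total

p = [1, -2, 0, 4]                      # p(x) = 1 - 2x + 4x^3, a sample function
h = Fraction(1, 3)
x0 = Fraction(1, 2)

shifted_by_series = exp_hD(p, h, x0)
shifted_directly = evaluate(p, x0 + h)
```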
Two operators, P and Q, are said to commute when PQ — QP is zero. 
Thus c and d/dz commute if cis a constant. Other examples of commut- 


b 
ing operators are: x and @/dy; ddr and f dx if a and b are constants; 


a+ and (—}). Clearly, every operator commutes with itself or any power 
of itself, provided that by the n-th power we mean the n-fold iteration of 
the operator. 

11.3. Postulates.³—a. The fundamental postulates of quantum mechanics
are three in number. The first concerns the use of observables.


3 Henceforth in the present section, and in all subsequent sections up to 11.25,
states will be supposed to be independent of the time; i.e., $\phi$ does not contain t. Such
states are known as stationary ones, and the part of quantum mechanics dealing with
them will be called quantum statics. In quantum dynamics, introduced in sec. 11.25, a
new postulate (Schrödinger’s “time” equation) will be needed. This postulate is not
included in the present list. Nor do we include the Pauli principle, which is also of
axiomatic status, and which will be presented in sec. 11.33. The present limitation is
made for pedagogical reasons.

Brief reflection will show that classical physics associates with observables
certain definite functions of suitable variables: x, y, z with position, $mv$
with linear momentum, $\tfrac{1}{2}mv^2$ with kinetic energy, and so forth. These
functions are chosen to describe experience most adequately. There is no 
logical reason which would exclude the use of more abstract mathematical 
entities in this association. It has indeed been found that, for the descrip- 
tion of atomic phenomena, certain operators should replace the functions 
which in classical mechanics represent observables. The first postulate 
may be stated as follows: 

To every observable there corresponds an operator. 

The correct operator to be associated with a given observable must be 
found by trial. In the following table we give a brief summary of the four 
most important operators of quantum mechanics; the observables in 
question are understood to refer to systems classically described as groups 
of mass points having 3n degrees of freedom (j = 1, 2, ..., n), subject to no
external forces (total energy constant) and not requiring relativity treat- 
ment. The first column gives the name of the observable, the second its 
classical representation, the third its quantum mechanical representation. 


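Observable: Cartesian coordinate
  Classical representation: $x_j$
  Quantum mechanical operator: $x_j$

Observable: Cartesian component of linear momentum of j-th particle
  Classical representation: $p_{x_j} = m_j\dot{x}_j$
  Quantum mechanical operator: $\dfrac{\hbar}{i}\dfrac{\partial}{\partial x_j}$

Observable: X-component of angular momentum of j-th particle
  Classical representation: $m_j(y_j\dot{z}_j - z_j\dot{y}_j)$
  Quantum mechanical operator: $\dfrac{\hbar}{i}\left(y_j\dfrac{\partial}{\partial z_j} - z_j\dfrac{\partial}{\partial y_j}\right)$

Observable: Total energy
  Classical representation: $\sum_j \dfrac{1}{2m_j}\left(p_{x_j}^2 + p_{y_j}^2 + p_{z_j}^2\right) + V(x_1 \cdots z_n)$
  Quantum mechanical operator: $-\dfrac{\hbar^2}{2}\sum_j \dfrac{1}{m_j}\left(\dfrac{\partial^2}{\partial x_j^2} + \dfrac{\partial^2}{\partial y_j^2} + \dfrac{\partial^2}{\partial z_j^2}\right) + V(x_1 \cdots z_n)$

$m_j$ is the mass of the j-th particle; $\hbar$ is an abbreviation for Planck’s
constant, h, divided by $2\pi$.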

The operator form of the Cartesian coordinate $x_j$ is identical with its
classical representation and has been included only for formal reasons. 
Linear momentum, a differential operator, is basic in the construction of 
the last two entries in the table. 

When the operator corresponding to the linear momentum p of a single
particle is written in the vector form $-i\hbar\nabla$, those corresponding to angular
momentum and energy of this particle may be constructed according
to classical formulas: Angular momentum = $r \times p = -i\hbar\,r \times \nabla$, and
energy = $(1/2m)p^2 + V = -(\hbar^2/2m)\nabla^2 + V$. These vector forms are
valid in all other systems of coordinates and should be used as the basis for 
transformations. 

In view of the table, the reader will easily verify the following operator 
equations: 

Let $Q_k$ stand for the operator “k-th Cartesian coordinate,” $P_k$ for the
k-th component of linear momentum. Then

$P_kQ_l - Q_lP_k = -i\hbar\,\delta_{kl}$ (11-6)

Also, if $L_x$, $L_y$ and $L_z$ denote the components of the angular momentum
operator for a single particle,⁴

$L_xL_y - L_yL_x = i\hbar L_z$
$L_yL_z - L_zL_y = i\hbar L_x$ (11-7)
$L_zL_x - L_xL_z = i\hbar L_y$


Commutation rules, like (6) and (7), are often sufficient to define the 
operators involved without recourse to their explicit form, but the latter is 
usually helpful. 


b. The second postulate states:
The only possible values which a measurement of the observable whose
operator is P can yield are the eigenvalues $p_\lambda$ of the equation

$P\psi_\lambda = p_\lambda\psi_\lambda$ (11-8)

provided $\psi_\lambda$ obeys conditions (1) and (2), namely: $\int \psi_\lambda^*\psi_\lambda\,d\tau < \infty$ and $\psi_\lambda$ is
single-valued.

The range of integration depends on the particular problem under con- 
sideration, as will be seen later. 

We illustrate the meaning of this postulate by a few examples. Let us
find the measurable values of the linear momentum of a particle, known to
be somewhere on the X-axis between the finite points x = a and x = b.
The operator P is $-i\hbar(d/dx)$. Eq. (8) therefore becomes a first-order
differential equation which can obviously be satisfied if $\psi_\lambda$ is assumed to be
a function of x only. It reads

$-i\hbar\dfrac{d\psi_\lambda}{dx} = p_\lambda\psi_\lambda$ (11-9)

and has the solution

$\psi_\lambda = c\,e^{(i/\hbar)p_\lambda x}$


4 $L_y$ and $L_z$ may be obtained from $L_x$ in the table by cyclical permutation of
coordinates.

Is this solution satisfactory from the point of view of eqs. (1) and (2)?
It is certainly single-valued; moreover, $\int_a^b \psi_\lambda^*\psi_\lambda\,dx = (b - a)c^*c$ is finite
for every finite c. Hence no restriction upon $p_\lambda$ results; all values of the
linear momentum may be found upon measurement. The eigenvalues of
the linear momentum form a continuous spectrum ($\lambda$ is not a discrete
index) and every function of the form $c\,e^{(i/\hbar)px}$ with constant p is an
eigenfunction. As far as measurable values of linear momentum are concerned,
quantum mechanics leads to the same result as classical physics.

This is not true for the angular momentum of a single particle. Here
eq. (8) reads

$-i\hbar\left(x\dfrac{\partial}{\partial y} - y\dfrac{\partial}{\partial x}\right)\psi_\lambda = m_\lambda\psi_\lambda$ (11-10)

provided we consider the z-component and write $m_\lambda$ for the eigenvalues.
Obviously, $\psi_\lambda$ must be a function of both x and y. But a simple transformation
of coordinates reduces the equation to a simpler form. On
putting $x = r\cos\theta$ and $y = r\sin\theta$, we have

$\dfrac{\partial}{\partial\theta} = -r\sin\theta\,\dfrac{\partial}{\partial x} + r\cos\theta\,\dfrac{\partial}{\partial y} = x\dfrac{\partial}{\partial y} - y\dfrac{\partial}{\partial x}$

Therefore eq. (10) becomes

$-i\hbar\dfrac{\partial\psi_\lambda}{\partial\theta} = m_\lambda\psi_\lambda$

and $\psi_\lambda$ is seen to be a function of $\theta$ alone. The solution is

$\psi_\lambda = c\,e^{(i/\hbar)m_\lambda\theta}$

It certainly has an integrable square, because the range of $\theta$ extends from 0
to $2\pi$, or more exactly, from $2\pi n$ to $2\pi(n + 1)$, where n is an integer. But
$\psi_\lambda$ violates the condition of single-valuedness which must be imposed in
the form (3). To satisfy it we must require that

$\psi_\lambda(\theta) = \psi_\lambda(\theta + 2\pi)$

and this implies $e^{2\pi i m_\lambda/\hbar} = 1$. This is true only if

$m_\lambda = \lambda\hbar$, $\lambda$ an integer (11-11)

Hence the only observable values of the angular momentum are given by
(11), and the eigenfunctions are $c\,e^{i\lambda\theta}$. This result is identical with the
postulate of the older Bohr theory concerning angular momentum.
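The effect of the single-valuedness condition can be seen by direct evaluation. The sketch below takes the solution with c = 1, writes it in terms of the ratio of the eigenvalue to ħ, and compares its values at θ and θ + 2π; the function names, the test angle, and the tolerance are illustrative assumptions.

```python
import cmath
import math

def psi(theta, m_over_hbar):
    """Solution exp(i * (m / hbar) * theta), with the constant c set to 1."""
    return cmath.exp(1j * m_over_hbar * theta)

def is_single_valued(m_over_hbar, theta=0.3, tol=1e-9):
    """True when psi returns to the same value after theta -> theta + 2*pi."""
    return abs(psi(theta + 2.0 * math.pi, m_over_hbar)
               - psi(theta, m_over_hbar)) < tol

allowed = [m for m in (-3, -1, 0, 1, 2) if is_single_valued(m)]
rejected = [m for m in (0.5, 1.25, -2.5) if not is_single_valued(m)]
```

Integer ratios pass the test and non-integer ones fail it, which is the content of (11-11).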

Next we consider the possible values of the total energy of a single mass
point. The energy operator appearing in the table is often referred to as
the Hamiltonian operator and is denoted by the symbol H. Let us use $E_\lambda$
for the eigenvalues. The operator equation then becomes

$H\psi_\lambda = -\dfrac{\hbar^2}{2m}\nabla^2\psi_\lambda + V\psi_\lambda = E_\lambda\psi_\lambda$ (11-12)

This equation, written perhaps more frequently in the form

$\nabla^2\psi_\lambda + \dfrac{2m}{\hbar^2}(E_\lambda - V)\psi_\lambda = 0$ (11-12)

was found by Schrödinger and bears his name. Its solutions and eigenvalues
clearly depend on the functional nature of V(x,y,z); they will be
reserved for detailed consideration in secs. 9 et seq.


A rather peculiar result is obtained when (8) is applied to the coordinate
“operator.” The eigenvalues of “$x$” are the values $\xi$ for which the
equation

$$x\psi_\xi = \xi\psi_\xi$$

an ordinary algebraic one, possesses solutions. On writing it in the form

$$(x - \xi)\psi_\xi = 0$$

it is evident that either $x = \xi$ or $\psi_\xi = 0$. In plainer language, $\psi_\xi$ as a
function of $x$ vanishes everywhere except at $x = \xi$, a constant. From a
rigorous mathematical point of view such a function is a monstrosity, but
it is useful for certain purposes to introduce it, as Dirac⁵ has done. It is
called $\delta(x - \xi)$, the symbol being fashioned after the Kronecker $\delta$, and is
best visualized as something like $\lim_{a\to\infty} c\,e^{-a(x-\xi)^2}$. For later use the
constant $c(a)$ will be so chosen that $\int_{-\infty}^{\infty}\delta(x - \xi)\,dx = 1$, so that

$$\int_{-\infty}^{\infty} f(x)\,\delta(x - \xi)\,dx = f(\xi)$$  (11-13)

Now it is clear that such a “function” can be formed for every value $\xi$,
hence every point of the $x$-axis is an eigenvalue of the $x$-coordinate.⁶
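The limiting picture of the $\delta$-function can be made concrete with a short numerical sketch (an added illustration, not from the original text). It assumes the Gaussian form $c(a)\,e^{-a(x-\xi)^2}$ with $c(a) = \sqrt{a/\pi}$, which gives unit area, and uses a plain trapezoidal rule; the names `delta_a` and `sift` are hypothetical.

```python
import math


def delta_a(x, xi, a):
    """Gaussian stand-in for delta(x - xi); c(a) = sqrt(a/pi) gives unit area."""
    return math.sqrt(a / math.pi) * math.exp(-a * (x - xi) ** 2)


def sift(f, xi, a, n=4000):
    """Trapezoidal estimate of the integral of f(x) * delta_a(x - xi) dx."""
    half_width = 10.0 / math.sqrt(a)   # the Gaussian is negligible beyond this
    lo, hi = xi - half_width, xi + half_width
    h = (hi - lo) / n
    total = 0.5 * (f(lo) * delta_a(lo, xi, a) + f(hi) * delta_a(hi, xi, a))
    for i in range(1, n):
        x = lo + i * h
        total += f(x) * delta_a(x, xi, a)
    return total * h


# As a grows, the integral approaches f(xi), as in eq. (13).
for a in (1.0, 100.0, 10000.0):
    print(a, sift(math.cos, 0.3, a), math.cos(0.3))
```

For finite $a$ the result differs slightly from $f(\xi)$; the difference shrinks as $a$ increases.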

⁵ Dirac, P. A. M., “Principles of Quantum Mechanics,” Third Edition; Clarendon
Press, Oxford, 1947.
⁶ The operator $x$ has a continuous spectrum. Correspondingly, the integral
$\int\delta^2(x - \xi)\,dx$ does not exist! See sec. 11.9c.

The significance of the second postulate is best grasped when it is
regarded as furnishing a catalogue of the measurable values of all observables
for which operators are known. It implies no information concerning
the meaning of the eigenfunctions $\psi_\lambda$. These are, of course, states of the
system in the sense explained. Their nature will unfold itself when the
third postulate has been set forth. For the present we only note that
every $\psi_\lambda$ is indeterminate with respect to a constant multiplier; eq. (8)
will also be satisfied by constant $\cdot\,\psi_\lambda$. On the other hand, $\int\psi_\lambda^*\psi_\lambda\,d\tau$
exists. We may require, therefore, that $\psi_\lambda$ is normalized after the manner
of sec. 8.2. Henceforth this will be assumed unless a statement to the contrary
is made. In this connection it may be recalled, however, that
normalization may fail intrinsically when the eigenvalues $p_\lambda$ form a continuous
spectrum. In Chapter 8 this was shown to be the case in instances
where the range of the fundamental variable became infinite. These
require special treatment.

The $\psi_\lambda$ will be orthogonal if operator and boundary conditions conform
to the circumstances of the Sturm-Liouville theory (sec. 8.5). This theory, 
as will later be seen, covers most of the cases occurring in quantum mechan- 
ics, but must be generalized somewhat to be applicable to complex opera- 
tors. 

e. We turn to the third postulate which states: 


When a given system is in a state $\phi$, the expected mean of a sequence of
measurements on the observable whose operator is $P$ is given by

$$\bar p = \int\phi^* P\phi\,d\tau$$  (11-14)

The expected mean is defined as in statistics: If a large number of measurements
is made on the system, and the measured values are $p_1, p_2, \cdots, p_N$,
then $\bar p = (1/N)\sum_{i=1}^{N} p_i$. Note that eq. (14) does not predict the outcome of a
single measurement.

In writing (14) we are again supposing that $\phi$ is normalized. This can
be brought about in all physical problems by “confining” the system in
configuration space, that is, by taking the volume in which it moves to be
finite, so that $\int d\tau$ exists. Even if the volume is infinite, $\int\phi^*\phi\,d\tau$ may
still exist, but in general the situation then calls for special treatment
involving the use of eigendifferentials instead of eigenfunctions.* A more
general form of eq. (14), which often works when the volume of configuration
space is infinite, is the following

$$\bar p = \lim_{\tau\to\infty}\frac{\int\phi^* P\phi\,d\tau}{\int\phi^*\phi\,d\tau}$$  (11-14')

* See Morse, P. M., and Feshbach, H., “Methods of Theoretical Physics,” McGraw-Hill
Book Co., Inc., 1953.

We illustrate the meaning of (14) by a few examples. Let a
system having one degree of freedom be in a state described by
$\phi = (b/\pi)^{1/4}e^{-(b/2)(x-\xi)^2}$. Then the mean value of its position will be:

$$\bar x = \int\phi^* x\,\phi\,dx = \xi$$

its mean momentum:

$$\bar p_x = -i\hbar\int\phi^*\phi'\,dx = 0$$

its mean kinetic energy:

$$\bar E_{\rm kin} = -\frac{\hbar^2}{2m}\int\phi^*\phi''\,dx = \frac{\hbar^2}{2m}\int\phi'^2\,dx = \frac{\hbar^2}{2m}\,\frac{b}{2}$$

It is interesting to note that, the more concentrated the function $\phi$
(the greater $b$) the larger will be the mean kinetic energy. To calculate
the mean total energy we should have to know the form of $V(x)$.

Let us take $\phi = e^{ikx}/(b - a)^{1/2}$ in the range $a \le x \le b$. We then find

$$\bar x = \int_a^b\phi^* x\,\phi\,dx = \frac{a + b}{2}$$

$$\bar p_x = -i\hbar\int_a^b\phi^*\phi'\,dx = k\hbar$$

$$\bar E_{\rm kin} = -\frac{\hbar^2}{2m}\int_a^b\phi^*\phi''\,dx = \frac{k^2\hbar^2}{2m}$$

If in this example the range is extended to infinity, let us say in such a
way that $-a = b \to \infty$, the function $e^{ikx}$ can clearly not be normalized.
One must then use eq. (14') in the form

$$\bar p = \lim_{b\to\infty}\frac{\int_{-b}^{b}\phi^* P\phi\,dx}{\int_{-b}^{b}\phi^*\phi\,dx}$$

which gives the same results as those obtained above.
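The three mean values quoted for the Gaussian state can be reproduced by direct numerical integration of eq. (14). The sketch below is an added illustration. It assumes units with $\hbar = m = 1$, uses the analytic derivatives of $\phi$, and the helper names (`integrate`, `expected_means`) are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1
MASS = 1.0  # assumption: units in which m = 1


def phi(x, b, xi):
    """Normalized Gaussian state (b/pi)^(1/4) * exp(-(b/2) * (x - xi)^2)."""
    return (b / math.pi) ** 0.25 * math.exp(-0.5 * b * (x - xi) ** 2)


def dphi(x, b, xi):
    """First derivative of phi with respect to x."""
    return -b * (x - xi) * phi(x, b, xi)


def d2phi(x, b, xi):
    """Second derivative of phi with respect to x."""
    return (b * b * (x - xi) ** 2 - b) * phi(x, b, xi)


def integrate(g, lo, hi, n=20000):
    """Composite trapezoidal rule for g on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h


def expected_means(b, xi):
    """Return (mean position, mean momentum, mean kinetic energy) from eq. (14)."""
    half = 12.0 / math.sqrt(b)   # phi is negligible outside this window
    lo, hi = xi - half, xi + half
    x_mean = integrate(lambda x: phi(x, b, xi) * x * phi(x, b, xi), lo, hi)
    p_mean = -1j * HBAR * integrate(lambda x: phi(x, b, xi) * dphi(x, b, xi), lo, hi)
    e_kin = -(HBAR ** 2 / (2 * MASS)) * integrate(
        lambda x: phi(x, b, xi) * d2phi(x, b, xi), lo, hi)
    return x_mean, p_mean, e_kin


x_mean, p_mean, e_kin = expected_means(b=2.0, xi=0.7)
print(x_mean, abs(p_mean), e_kin, HBAR ** 2 * 2.0 / (4 * MASS))
```

The last two printed numbers compare the integrated kinetic energy with $\hbar^2 b/4m$; increasing `b` raises both, which is the concentration effect noted in the text.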

The three postulates here stated and exemplified do not reveal an
intuitive meaning of the state function $\phi$. It is therefore not unusual in
textbooks on quantum mechanics to add another postulate stating that
$\phi^*(x)\phi(x)$ signifies the probability that the “particle” whose state is
$\phi$ be found at the point $x$ of configuration space (with suitable generalization
for more than one degree of freedom). This is indeed true, and it may
be well for the reader to form this basic conception; but this statement is
not a further postulate since it may be deduced from those already given.
(Cf. sec. 6.)


DEDUCTIONS FROM THE POSTULATES 


11.4. Orthogonality and Completeness of Eigenfunctions.—In Chapter 
8, orthogonality and completeness of the eigenfunctions belonging to the 
Sturm-Liouville operator $L$ have been discussed. The proofs there given
need to be generalized if they are to be applied to quantum mechanics, for
the operators occurring there are not all of the same structure as $L$. (One
of the most important equations encountered, the one-dimensional Schrödinger
equation (12), is of the Sturm-Liouville type.) They often involve
many variables, they may be differential operators of the first order, they
may be complex; in fact they may not be differential operators at all.
To simplify the theory we shall assume that the eigenvalues $p_\lambda$ of eq. (8)
are discrete, and that the boundary conditions on acceptable state functions
are of the form 1 and 2. Whenever convenient we shall even assume
that $\phi$ vanishes at the boundary of configuration space, over which integrations
are to be carried out, in a manner suitable to our needs. Unless
these restrictions are made the arguments become involved and in some
respects problematic. It would then be necessary to conduct a separate
proof for every problem of interest; thus elegance would fall prey to rigor.

We first define what is meant by an Hermitian operator. Let $u$ and $v$
be two “acceptable” functions, defined over a certain range of configuration
space $\tau$. We then say that the operator $P$ is Hermitian if

$$\int u^* P v\,d\tau = \int v\,P^* u^*\,d\tau$$  (11-15)

All operators of interest in quantum mechanics have this property. As a
sample proof we show this for the linear momentum $P_j = -i\hbar(\partial/\partial q_j)$,
associated with the $j$-th Cartesian coordinate:

$$\int u^* P_j v\,d\tau = -i\hbar\int u^*\frac{\partial v}{\partial q_j}\,d\tau = -i\hbar\int u^*\frac{\partial v}{\partial q_j}\,dq_1\cdots dq_n$$

First perform the integration over $q_j$, which yields

$$-i\hbar\int\big[u^* v\big]\,dq_1\cdots dq_{j-1}\,dq_{j+1}\cdots dq_n + i\hbar\int v\,\frac{\partial u^*}{\partial q_j}\,d\tau$$

The first integral, a “surface” integral taken only over $n - 1$ coordinates
but with $u$ and $v$ evaluated at the end points of the range for $q_j$,
will vanish provided $u$ and $v$ vanish sufficiently strongly for these extreme
values of $q_j$, which is what we are supposing. The remaining integral is
indeed identical with $\int v\,P_j^* u^*\,d\tau$.

The Hermitian property of $x$ is obvious. To prove it for the Hamiltonian
$H$, two partial integrations are necessary; the details may be left
as an exercise for the reader.

Hermitian operators have real eigenvalues. This fact follows at once
from eq. (15). The eigenvalues of $P$ are defined by the equation

$$P\psi_\lambda = p_\lambda\psi_\lambda$$  (11-16)

This also implies the validity of the equation

$$P^*\psi_\lambda^* = p_\lambda^*\psi_\lambda^*$$  (11-17)

Now multiply (16) by $\psi_\lambda^*$ and (17) by $\psi_\lambda$, and integrate over $d\tau$, obtaining

$$\int\psi_\lambda^* P\psi_\lambda\,d\tau = p_\lambda\int\psi_\lambda^*\psi_\lambda\,d\tau$$

$$\int\psi_\lambda P^*\psi_\lambda^*\,d\tau = p_\lambda^*\int\psi_\lambda\psi_\lambda^*\,d\tau$$

By (15) the left-hand sides of these two equations are equal, for $\psi_\lambda$ is certainly
an acceptable function in the sense outlined before. Hence $p_\lambda^* = p_\lambda$;
i.e., $p_\lambda$ is real. Since the eigenvalues of operators are measurable values of
observables, which must of necessity be real, the physical significance of an
operator is assured when it has the Hermitian property.

Let us again consider eq. (16). If $\psi_\mu$ is some other eigenfunction, it is
evident that

$$\int\psi_\mu^* P\psi_\lambda\,d\tau = p_\lambda\int\psi_\mu^*\psi_\lambda\,d\tau$$  (11-18)

But if we start with the equation

$$P^*\psi_\mu^* = p_\mu\psi_\mu^*$$

which is true because $p_\mu$ is real, we also conclude that

$$\int\psi_\lambda P^*\psi_\mu^*\,d\tau = p_\mu\int\psi_\mu^*\psi_\lambda\,d\tau$$  (11-19)

Combining (18) and (19) we find

$$\int\psi_\mu^* P\psi_\lambda\,d\tau - \int\psi_\lambda P^*\psi_\mu^*\,d\tau = (p_\lambda - p_\mu)\int\psi_\mu^*\psi_\lambda\,d\tau$$

If $P$ is Hermitian the left-hand side vanishes. Hence either $p_\lambda = p_\mu$ or
$\int\psi_\mu^*\psi_\lambda\,d\tau = 0$. We see that eigenfunctions of Hermitian operators, belonging
to different eigenvalues, are orthogonal.
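A finite-dimensional analogue may help fix the two results just derived. The sketch below is an added illustration, not the authors' method: it replaces the operator by a 2×2 Hermitian matrix, for which the eigenvalues and eigenvectors can be written in closed form, and checks that the eigenvalues are real and the eigenvectors orthogonal. The function names are hypothetical.

```python
import math


def eig_hermitian_2x2(a, c, d):
    """Eigenpairs of the Hermitian matrix [[a, c], [conj(c), d]].

    a and d are real and c is a nonzero complex number. The eigenvalues
    mean -/+ radius are real by construction of the square root.
    """
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(c) ** 2)
    pairs = []
    for lam in (mean - radius, mean + radius):
        # First row of (M - lam) v = 0 is satisfied by v = (c, lam - a).
        pairs.append((lam, (c, lam - a)))
    return pairs


def inner(u, v):
    """Discrete counterpart of the integral of u* v d(tau)."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


(lam1, v1), (lam2, v2) = eig_hermitian_2x2(1.0, 2 + 1j, -3.0)
print(lam1, lam2, abs(inner(v1, v2)))
```

The two eigenvalues differ, and the inner product of the corresponding eigenvectors vanishes to rounding error.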
The completeness of the eigenfunctions of all operators employed in
quantum mechanics is usually assumed. To the authors’ knowledge, a
rigorous proof has not been given. Since, however, our main interest will
be in the Schrödinger equation which is of the Sturm-Liouville type, this
point need not detain us further. In the following we shall assume completeness
of all $\psi_\lambda$ whenever this property is needed.


Problem. Show that the angular momentum operator $L_z = -i\hbar(\partial/\partial\varphi)$ is Hermitian.


11.5. Relative Frequencies of Measured Values.—Important conse- 
quences can now be deduced from the third postulate, eq. (14). We first 
note that, if $P$ is Hermitian, every power of $P$ is Hermitian. Moreover, if
(14) is true for every operator $P$, it must certainly hold for the operator
$P^r$. It implies, therefore,

$$\overline{p^r} = \int\phi^* P^r\phi\,d\tau, \qquad r = 1, 2, \cdots$$  (11-20)

The left-hand side stands, of course, for the $r$-th moment of the statistical
aggregate of the measured values, i.e.,

$$\overline{p^r} = \sum_i\rho_i\,p_i^r$$  (11-21)

provided $\rho_i$ is the relative frequency of the occurrence of the $i$-th eigenvalue
$p_i$ in the set of measurements. In accordance with eq. (20), the
state function $\phi$ predicts not only the mean, but all moments of the aggregate
of measurements.⁷ Now eq. (20) may be transformed as follows.
Let the eigenfunctions of $P$ be denoted by $\psi_\lambda$, so that $P\psi_\lambda = p_\lambda\psi_\lambda$. On
allowing $P$ to operate on both sides of this equation, there results
$P^2\psi_\lambda = p_\lambda P\psi_\lambda = p_\lambda^2\psi_\lambda$. By continuing this process, the relation

$$P^r\psi_\lambda = p_\lambda^r\psi_\lambda$$  (11-22)

is established. If the function $\phi$ appearing in (20) is expanded in terms
of the $\psi_\lambda$,

$$\phi = \sum_\lambda a_\lambda\psi_\lambda$$

and this series is substituted, we find

$$\overline{p^r} = \int\sum_{\lambda\mu}a_\lambda^* a_\mu\,\psi_\lambda^* P^r\psi_\mu\,d\tau = \sum_{\lambda\mu}a_\lambda^* a_\mu\,p_\mu^r\int\psi_\lambda^*\psi_\mu\,d\tau = \sum_\lambda|a_\lambda|^2 p_\lambda^r$$

by virtue of (22) and the orthogonality of the $\psi_\lambda$. Comparing this with
(21) it is clear that

$$\sum_i\rho_i\,p_i^r = \sum_i|a_i|^2 p_i^r$$

for every integer $r$. But this can be true only if

$$\rho_i = |a_i|^2$$  (11-23)

⁷ For terminology, see sec. 12.3.

In words: when the system is in the state $\phi$, a measurement of the observable
corresponding to $P$ will yield the value $p_i$ with a probability (relative
frequency) $|a_i|^2$, $a_i$ being the coefficient of $\psi_i$ in the expansion $\phi = \sum_\lambda a_\lambda\psi_\lambda$,
where $\psi_\lambda$ is one of the eigenfunctions of $P$. The coefficients $a_i$ are called
probability amplitudes.
They may be expressed in terms of $\phi$ and $\psi_i$ by the relation

$$\int\psi_i^*\phi\,d\tau = \sum_\lambda a_\lambda\int\psi_i^*\psi_\lambda\,d\tau = a_i$$  (11-24)

Consequently, eq. (23) may also be written

$$\rho_i = \Big|\int\psi_i^*\phi\,d\tau\Big|^2$$  (11-25)

An interesting result is obtained when, in this equation, we let $\phi$ be
one of the eigenfunctions belonging to the operator $P$ itself, e.g., $\psi_j$. It
then reads

$$\rho_i = \Big|\int\psi_i^*\psi_j\,d\tau\Big|^2 = \delta_{ij}$$

All relative frequencies are zero except the one measuring the occurrence
of the eigenvalue $p_j$, which is unity. Thus we conclude that an eigenstate
$\psi_j$ of an operator $P$ is a state in which the system yields with certainty the
value $p_j$ when the observable corresponding to $P$ is measured. Eigenfunctions
are simply state functions of this determinate character.
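Eqs. (24) and (25) can be tried out on a concrete expansion. The sketch below is an added illustration under stated assumptions: the eigenfunctions are taken to be $\sqrt2\sin(n\pi x)$ on $0 \le x \le 1$ (the one-dimensional form of the box functions of sec. 11.9a), and the state $\phi = \sqrt{30}\,x(1 - x)$ is a hypothetical normalized example. The code computes the amplitudes $a_n$ by numerical integration and confirms that the relative frequencies $|a_n|^2$ add up to unity.

```python
import math


def psi(n, x):
    """Assumed eigenfunction sqrt(2) * sin(n*pi*x) on the interval [0, 1]."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def phi(x):
    """Hypothetical normalized state sqrt(30) * x * (1 - x)."""
    return math.sqrt(30.0) * x * (1.0 - x)


def amplitude(n, steps=2000):
    """a_n = integral of psi_n * phi dx (eq. 24); both functions are real.

    The integrand vanishes at x = 0 and x = 1, so the trapezoidal rule
    reduces to a plain sum over the interior points.
    """
    h = 1.0 / steps
    return h * sum(psi(n, i * h) * phi(i * h) for i in range(1, steps))


# Relative frequencies rho_n = |a_n|^2 of eq. (23) for the first 39 states.
probs = [amplitude(n) ** 2 for n in range(1, 40)]
print(probs[0], probs[1], probs[2], sum(probs))
```

Nearly all of the probability falls on the lowest eigenvalue, and the even-numbered states do not occur at all because of the symmetry of $\phi$.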

11.6. Intuitive Meaning of a State Function.—Consider now a system, 
like a simple mass point with one degree of freedom, whose state function is 
$\phi(x)$. We wish to know the probability that a measurement of its position
will give the value $x = \xi$. The eigenfunction corresponding to the operator
$x$ for the value $\xi$ has been shown to be

$$\psi_\xi = \delta(x - \xi)$$

Eq. (25) now reads

$$\rho_\xi = \Big|\int\delta(x - \xi)\,\phi(x)\,dx\Big|^2 = |\phi(\xi)|^2$$  (11-26)

by virtue of (13). The probability (density) of finding the system at $\xi$ is
given by the square of the absolute value of its state function. This fact
provides a simple intuitive meaning for the state function. It can be
generalized to several dimensions.
Let $q_1, q_2, \cdots, q_n$ be the coordinates on which $\phi$ depends. Using the
former arguments, the eigenfunction corresponding to the composite coordinate
operator $q_1\cdot q_2\cdots q_n$ may be shown to be

$$\psi_{\xi_1\xi_2\cdots\xi_n} = \delta(q_1 - \xi_1)\,\delta(q_2 - \xi_2)\cdots\delta(q_n - \xi_n)$$  (11-27)

If, therefore, we wish to find the probability $\rho_{\xi_1\xi_2\cdots\xi_n}$ of finding the system
at the point $(\xi_1, \xi_2, \cdots, \xi_n)$ of configuration space, we must use eq. (25) with
$\psi_i$ replaced by (27). Hence

$$\rho_{\xi_1\xi_2\cdots\xi_n} = \Big|\int\!\!\int\!\cdots\!\int\delta(q_1 - \xi_1)\cdots\delta(q_n - \xi_n)\,\phi(q_1\cdots q_n)\,dq_1\,dq_2\cdots dq_n\Big|^2 = |\phi(\xi_1, \xi_2, \cdots, \xi_n)|^2$$

11.7. Commuting Operators.—Let P and R be two operators satisfying 
the relation $PR - RP = 0$, and let their eigenfunctions be $\psi_\lambda$ and $\chi_\mu$,
that is

$$P\psi_\lambda = p_\lambda\psi_\lambda, \qquad R\chi_\mu = r_\mu\chi_\mu$$  (11-28)

We assume the state function to be $\psi_i$ so that, when $P$ is measured, there
results with certainty the value $p_i$. But

$$RP\psi_i = PR\psi_i = p_i R\psi_i$$

Considering only the last two members of this equation, we may say that
$(R\psi_i)$ is an eigenfunction of $P$, namely that belonging to the eigenvalue $p_i$.
But this is possible (if $p_i$ is not degenerate) only if $R\psi_i = \text{const.}\cdot\psi_i$.
Comparison with the second equation (28) shows the constant to be one of
the $r_\mu$, and $\psi_i$ to be one of the eigenfunctions $\chi_\mu$. We conclude that
commuting operators have simultaneous eigenstates; i.e., measurements on
their observables yield definite values for both; they do not “spread.”

The fact that, when $P$ and $Q$ are non-commuting operators and the
state of the system is an eigenstate of $P$, measurements on $Q$ will give a
statistical aggregate of values and not a single one with certainty, is usually
attributed to the interference of measuring devices. For instance, the
measurement of a particle’s position disturbs its momentum, and vice
versa, so that when one is ascertained with precision, the other quantity
loses it. From this point of view, measurements on the observables associated
with commuting operators are said to be compatible; the procedures
of measurement do not conflict with each other.

11.8. Uncertainty Relation.—The proof of the famous Heisenberg 
uncertainty principle which will now be given requires the use of an inequal- 
ity, similar to a well known relation due to Schwarz, though not identical 
with it. (Cf. eq. 3-112.) It states: if $u$ and $v$ are any two “acceptable”
functions in the sense specified in connection with the definition of Hermitian
operators (sec. 11.4), then

$$\int u^*u\,d\tau\cdot\int v^*v\,d\tau \ge \frac14\Big[\int(u^*v + v^*u)\,d\tau\Big]^2$$  (11-29)

We assume a system to be in a state $\phi$, which need not be an eigenstate
of any particular operator, and we are interested in the results of measurements
on the observables belonging to two operators, $P$ and $Q$, at present
unspecified. Introduce into eq. (29) the following functions

$$u = (P - \bar p)\phi \quad\text{and}\quad v = i(Q - \bar q)\phi$$

where $\bar p$ and $\bar q$ are mean values associated with $P$ and $Q$ through the relation
(14). Eq. (29) then reads

$$\int(P - \bar p)^*\phi^*(P - \bar p)\phi\,d\tau\cdot\int(Q - \bar q)^*\phi^*(Q - \bar q)\phi\,d\tau \ge \frac14\Big[i\int(P - \bar p)^*\phi^*(Q - \bar q)\phi\,d\tau - i\int(Q - \bar q)^*\phi^*(P - \bar p)\phi\,d\tau\Big]^2$$

Now $P$ and $Q$ are Hermitian and satisfy eq. (15); $\bar p$ and $\bar q$ are constants.
Therefore the inequality reduces to

$$\int\phi^*(P - \bar p)^2\phi\,d\tau\cdot\int\phi^*(Q - \bar q)^2\phi\,d\tau \ge -\frac14\Big[\int\phi^*(PQ - QP)\phi\,d\tau\Big]^2$$  (11-30)

Let us consider the meaning of the quantity $\int\phi^*(P - \bar p)^2\phi\,d\tau$. When
$\phi$ is expanded in eigenfunctions $\psi_\lambda$ of $P$, $\phi = \sum_\lambda a_\lambda\psi_\lambda$, and the expansion is
introduced in the integral, the result is $\sum_\lambda|a_\lambda|^2(p_\lambda - \bar p)^2$, and this, in view
of eq. (23), is nothing other than the dispersion⁸ of the statistical aggregate
of $p$-measurements about their mean. For this quantity we may introduce
the more familiar symbol $\Delta p^2$. A similar identification is to be made
for $\int\phi^*(Q - \bar q)^2\phi\,d\tau$. Inequality (30) then takes the more interesting
form

$$\Delta p^2\cdot\Delta q^2 \ge -\frac14\Big[\int\phi^*(PQ - QP)\phi\,d\tau\Big]^2$$  (11-31)

⁸ The “dispersion” is the square of the so-called “standard deviation.” It is an
index of the “spread” of the measurements. See Chapter 12.

Now if $P$ and $Q$ commute, the right-hand side is zero, and it is possible for
$\Delta p$ or $\Delta q$ to be zero, or even for both to vanish. This state of affairs
recalls the result of sec. 7, which was that both $p$- and $q$-measurements
could yield single values without spread.

When $P$ and $Q$ do not commute, relation (31) sets a lower limit for the
product of the dispersions, often called uncertainties. Suppose, for
instance, that $P$ is the operator $-i\hbar(\partial/\partial q)$, the linear momentum associated
with $q$, and $Q$ stands for the coordinate $q$. We then have

$$PQ - QP = -i\hbar$$  (11-32)

When this is put into (31) the result is $\Delta p^2\cdot\Delta q^2 \ge \hbar^2/4$, or, written in
terms of standard deviations, $\delta p$ and $\delta q$,

$$\delta p\cdot\delta q \ge \frac{\hbar}{2}$$  (11-33)

This is Heisenberg’s uncertainty relation.

Our result need not be cast in the form of an inequality. It is indeed
quite possible to calculate both $\delta p$ and $\delta q$ separately and exactly when the
state function $\phi$ is given, as the postulates show.

A slight generalization of the present conclusions is also possible.
There are other operators, such as $L_z$ and $\varphi$ (cf. eq. 10 et seq.) which also
obey eq. (32). In fact all quantities which are called canonically conjugate
in classical physics⁹ have operators which satisfy it. (Later we
shall see that energy and time belong to this class.) For all these, the
uncertainty relation in the form (33) is valid.
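Since $\delta p$ and $\delta q$ can be computed separately for a given state, relation (33) is easy to test numerically. The sketch below is an added illustration with assumptions stated in the comments: units with $\hbar = 1$, real normalized state functions that vanish at the ends of the range (so that the mean momentum is zero and $\overline{p^2} = \hbar^2\int\phi'^2\,dq$), and two hypothetical test states, a Gaussian and the lowest sine function on $0 \le q \le 1$.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def integrate(g, lo, hi, n=20000):
    """Composite trapezoidal rule for g on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h


def uncertainty_product(phi, dphi, lo, hi):
    """delta_p * delta_q for a real, normalized phi that vanishes at lo and hi.

    For such a state the mean momentum is zero, and one partial integration
    gives mean(p^2) = hbar^2 * integral of (phi')^2.
    """
    q_mean = integrate(lambda q: q * phi(q) ** 2, lo, hi)
    q2_mean = integrate(lambda q: q * q * phi(q) ** 2, lo, hi)
    p2_mean = HBAR ** 2 * integrate(lambda q: dphi(q) ** 2, lo, hi)
    return math.sqrt(p2_mean) * math.sqrt(q2_mean - q_mean ** 2)


def gauss(q):
    """Gaussian state pi^(-1/4) * exp(-q^2 / 2)."""
    return math.pi ** -0.25 * math.exp(-0.5 * q * q)


def dgauss(q):
    return -q * gauss(q)


def box(q):
    """Lowest sine state sqrt(2) * sin(pi*q) on [0, 1]."""
    return math.sqrt(2.0) * math.sin(math.pi * q)


def dbox(q):
    return math.sqrt(2.0) * math.pi * math.cos(math.pi * q)


print(uncertainty_product(gauss, dgauss, -12.0, 12.0))  # close to hbar/2
print(uncertainty_product(box, dbox, 0.0, 1.0))         # exceeds hbar/2
```

The Gaussian attains the lower limit of (33), while the sine state lies above it.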


Problem. Show that, if the state function $\phi$ is an eigenfunction of the angular
momentum operator $L_z$ corresponding to the eigenvalue $l_z$, the product of $\delta l_x$ and $\delta l_y$ is
at least as great as $(\hbar/2)\,l_z$.


SCHRÖDINGER EQUATIONS


Attention will now be given to the eigenvalues and eigenfunctions of 
the energy operator, that is, to the solutions of the various forms of the 
Schrödinger equations, eq. (12). 

11.9. Free Mass Point.—The simplest example of a physical system 
is the free mass point for which the potential energy V may be taken to be 
zero. In that case eq. (12) reads 


$$\nabla^2\psi + k^2\psi = 0$$  (11-34)

provided we omit the subscript $\lambda$ and write $k^2 = 2mE/\hbar^2$. This quantity
$k^2$ has a rather simple classical significance which it is well to recognize at
once. For if $E$ is the total energy of the particle, which is in this case
purely kinetic, then $E = \frac12 mv^2 = p^2/2m$. Hence $k = p/\hbar$, $p$ being the
classical momentum of the particle. Note also that $k$ has the dimension
of a reciprocal length.

⁹ Cf. sec. 9.2.

Eq. (34) has already been solved in Chapter 7 (cf. eq. 7-33), where it
appeared as the space form of the wave equation. To select the proper
solution, we must consider the fundamental domain, $\tau$, of our problem.
Here, a great number of possibilities present themselves.


a. Enclosure is a Parallelepiped. If the particle is known to be within
a parallelepiped of side lengths $l_1$, $l_2$, and $l_3$, then $\tau$ is this volume of space.
Moreover, since $|\psi(x,y,z)|^2$ has already been identified as the probability
of finding the particle at the point $x, y, z$, this quantity must certainly be
zero everywhere outside $\tau$. For reasons of continuity (which can, by more
expanded arguments, be shown to result from our axioms) we require that
$|\psi|^2$, and hence $\psi$ itself, shall vanish on the boundaries of $\tau$ also. In view
of this boundary condition, the solution of (34) in rectangular coordinates,
namely eq. 7-36, must be chosen. In more explicit form it reads

$$\psi = (A_1e^{ik_1x} + B_1e^{-ik_1x})(A_2e^{ik_2y} + B_2e^{-ik_2y})(A_3e^{ik_3z} + B_3e^{-ik_3z}), \qquad k^2 = k_1^2 + k_2^2 + k_3^2$$

The origin of the parallelepiped may be taken in one corner. Vanishing
of $\psi$ at the boundary then requires:

$$A_s + B_s = 0, \qquad A_se^{ik_sl_s} + B_se^{-ik_sl_s} = 0, \qquad s = 1, 2, 3$$

The first condition makes each parenthesis of $\psi$ a sine-function; the second
implies

$$k_s = \frac{n_s\pi}{l_s}$$

where $n_s$ is an integer. Hence

$$\psi = c\,\sin\Big(\frac{n_1\pi}{l_1}x\Big)\sin\Big(\frac{n_2\pi}{l_2}y\Big)\sin\Big(\frac{n_3\pi}{l_3}z\Big)$$  (11-35)

and

$$k^2 = \pi^2\Big(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\Big)$$

so that

$$E = \frac{\pi^2\hbar^2}{2m}\Big(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\Big)$$  (11-36)

If $\psi$ is to be normalized, $\int\psi^*\psi\,dx\,dy\,dz = 1$, and the constant $c$ has the value

$$c = \Big(\frac{8}{l_1l_2l_3}\Big)^{1/2} = \Big(\frac{8}{\tau}\Big)^{1/2}$$

The permitted energy values form a denumerably infinite set. Their
arrangement is best represented by constructing a lattice of points filling
all space, with the “reciprocal” parallelepiped of sides $1/l_1$, $1/l_2$, $1/l_3$ as
crystallographic unit. If from a given point lines are drawn to all other
points, the squares of the lengths of these lines (multiplied by $\pi^2\hbar^2/2m$)
are the energies of our problem. However, not all these lines represent
different states. The function $\psi$ changes only its sign when one of the
integers $n_1$, $n_2$ or $n_3$ changes sign; it is not thereby converted into a new,
linearly independent function. Hence only the lines lying in one octant
of the lattice, with the origin of the lines at one corner, will represent
different states. If some of the $l$’s are equal there will be degeneracy (cf.
sec. 8.6), for then an interchange of the corresponding $n$’s will not produce
a different $E$, while $\psi$ will be changed into a function which is linearly
independent from the original one.
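The lattice picture of the energy levels can be tabulated directly. The sketch below is an added illustration: it assumes units with $\hbar = m = 1$, restricts the $n_s$ to positive integers (one octant; a zero value would make $\psi$ vanish identically), and counts how many triples $(n_1, n_2, n_3)$ share each value of $E = (\pi^2\hbar^2/2m)\sum_s n_s^2/l_s^2$. The function names and the cutoff `n_max` are hypothetical.

```python
import math
from collections import Counter
from itertools import product

HBAR = 1.0  # assumption: units in which hbar = 1
MASS = 1.0  # assumption: units in which m = 1


def energy(n, sides):
    """Energy of the state (n1, n2, n3) in a box with side lengths (l1, l2, l3)."""
    return (math.pi ** 2 * HBAR ** 2 / (2 * MASS)) * sum(
        (ns / ls) ** 2 for ns, ls in zip(n, sides))


def levels(sides, n_max=4):
    """Map each distinct energy to its degeneracy, using 1 <= n_s <= n_max.

    Only levels low enough to need no n_s above n_max are counted completely.
    """
    counts = Counter()
    for n in product(range(1, n_max + 1), repeat=3):
        counts[round(energy(n, sides), 9)] += 1
    return dict(sorted(counts.items()))


cube = levels((1.0, 1.0, 1.0))
for e, g in list(cube.items())[:6]:
    print(round(e / (math.pi ** 2 / 2), 6), g)   # sum of n_s^2 and degeneracy
```

For the cube the lowest levels come out with degeneracies 1, 3, 3, 3, 1, 6; making the three sides unequal removes the degeneracy that arises from interchanging the $n$'s.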


b. Enclosure is a Sphere. Eq. (34) must now be solved in spherical
coordinates. But this has already been done in sec. 8.4 (cf. eq. 8-25),
for an acoustical problem. The eigenfunctions are, aside from a normalizing
factor, $\psi = Y_l(\theta,\varphi)\,r^{-1/2}J_{l+1/2}(kr)$. The permitted energies are determined
by the condition $J_{l+1/2}(ka) = 0$ where $a$ is the radius of the enclosure.
For any integer $l$, there will be an infinite set of roots of $J_{l+1/2}$
which we shall label $r_{l,n}$, $n = 1, 2, \cdots, \infty$. The permitted $k$’s are therefore

$$k_{l,n} = \frac{r_{l,n}}{a}$$

and hence $E$, which will also depend on two indices (quantum numbers),
is given by

$$E_{l,n} = \frac{\hbar^2}{2ma^2}(r_{l,n})^2$$

The simple model treated here is called the “infinite potential hole.” It
forms the basis for many nuclear quantum mechanical calculations and is
one of the favored starting points for considerations leading to nuclear
shell structure.* A solution of the potential-hole problem with finite
walls† requires the use of Bessel functions inside, Hankel functions outside
the hole. The sequence of the energy values is unaltered, but all
levels are depressed.

* Mayer, M. G. and Jensen, J. H. D., “Elementary Theory of Nuclear Shell Structure,”
John Wiley and Sons, Inc., New York, 1955.
† Margenau, H., Phys. Rev. 46, 613 (1934).

c. No Enclosure. When the particle is allowed to exist anywhere in
space, the former boundary conditions need not be applied. The simplest
way to treat this case is to return to case (a) and permit $l_1$, $l_2$, and $l_3$ to
become infinite. Let us first consider the eigenvalues. The lattice of
points will condense as the $l$’s increase, until finally it forms a continuum;
the energy states (lengths of connecting lines squared) will also move closer
and closer together until finally all (positive) energies are permitted. A
similar effect may be brought about by increasing the mass of the particle,
as a glance at eq. (36) will show. Quantum mechanics indicates no quantization
of the energy for particles which are not restricted in their motion, or
which have an infinite mass.

What happens to the $\psi$-function, (35), as the $l$’s increase? Clearly, the
normalizing constant $c$ tends to zero, causing $\psi$ also to vanish. The meaning
of this is quite simple: As the space in which the mass point moves
increases indefinitely, the chance of finding it at a given point, $|\psi(x,y,z)|^2$,
approaches zero. The failure of the normalization rule is therefore not
merely a mathematical phenomenon, but physically reasonable. To circumvent
it, several procedures may be employed. One is to suppose that
there is an infinite number of particles in all space, $N$ per unit volume, and
accordingly to put $\int|\psi(x,y,z)|^2\,d\tau$, taken over a unit of volume, equal to $N$.
This leaves $c$ finite.¹⁰

When there are no boundary conditions the $\psi$-function need not be
written as a product of sines. In fact in the absence of an enclosure sine,
cosine and exponential functions are equally acceptable. Hence we may,
if we desire, write

$$\psi_{\mathbf k} = c(\mathbf k)\,e^{i\mathbf k\cdot\mathbf r}, \qquad E = \frac{\hbar^2k^2}{2m}$$

using the notation explained in connection with eq. (38) of Chapter 7.


Problem. Calculate eigenfunctions and eigenvalues of a free particle enclosed in
a cylinder of radius $a$ and length $d$, obtaining

$$\psi = c\,e^{i[(n\pi/d)z \pm m\varphi]}J_m(\alpha\rho)$$

where $\alpha a$ is a root of $J_m$.


¹⁰ Another procedure is discussed for instance in Sommerfeld, A., “Atombau und
Spektrallinien,” Vol. II.

11.10. One-Dimensional Barrier Problems.—For a one-dimensional
problem the Schrödinger equation is

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}[E - V(x)]\psi = 0$$

Let us take $V$ to be the step function given by the solid line in Fig. 1,
that is: $V = 0$ if $x < 0$, $V$ = constant if $x > 0$. The solutions for
the two regions are easily written down:

$$\psi_l = A_le^{ik_lx} + B_le^{-ik_lx}, \qquad x < 0 \ \text{(left of 0)}$$
$$\psi_r = A_re^{ik_rx} + B_re^{-ik_rx}, \qquad x > 0 \ \text{(right of 0)}$$

with

$$k_l = \frac{\sqrt{2mE}}{\hbar} \quad\text{and}\quad k_r = \frac{\sqrt{2m(E - V)}}{\hbar}$$

[Fig. 11-1. Step potential: V = 0 for x < 0, V = const. for x > 0.]

But how are they to be joined? The differential equation tells us that $\psi''$
suffers a finite discontinuity as we pass across the discontinuity in $V$. The
increase in $\psi'$ in crossing the origin will be

$$\lim_{\epsilon\to0}\int_{-\epsilon}^{\epsilon}\psi''\,dx = \lim_{\epsilon\to0}\epsilon\,(\psi_l'' + \psi_r'') = 0$$

Hence $\psi'$ (and a fortiori $\psi$) remains continuous at the origin. The constants
$A$ and $B$ must therefore be fixed by requiring

$$\psi_l(0) = \psi_r(0); \qquad \psi_l'(0) = \psi_r'(0)$$

In addition to these two we have an equation expressing normalization,
three relations in all. However, there are four constants $(A_l, A_r, B_l, B_r)$ to
be determined. The mathematical situation is therefore such that one of
them may be chosen at will. Let us then put $B_r$ equal to zero. The physical
meaning of this will at once be clear.

On applying the continuity conditions we have

$$A_l + B_l = A_r; \qquad k_l(A_l - B_l) = k_rA_r$$

whence

$$B_l = \frac{k_l - k_r}{k_l + k_r}\,A_l, \qquad A_r = \frac{2k_l}{k_l + k_r}\,A_l$$  (11-37)

The coefficients $A$ and $B$ have a simple significance. Let us analyze
from our fundamental point of view a state function of the form
$\psi = Ae^{ikx} + Be^{-ikx}$. In view of the third postulate (eq. 14') it represents
a mean momentum

$$\bar p = -i\hbar\,\frac{\int\psi^*\psi'\,dx}{\int\psi^*\psi\,dx}$$

We have intentionally left the limits of integration indefinite. In evaluating
the integrals occurring here we assume that the range of integration is
very much larger than the wave length of the particles, $2\pi/k$. The integral
over the last two terms of $\psi^*\psi = A^*A + B^*B + AB^*e^{2ikx} + A^*Be^{-2ikx}$
will then vanish, and

$$\int\psi^*\psi\,dx = (|A|^2 + |B|^2)\,l$$

$l$ being the range of integration. By a similar procedure,

$$\int\psi^*\psi'\,dx = ik(|A|^2 - |B|^2)\,l \quad\text{and}\quad \int\psi^*\psi''\,dx = -k^2(|A|^2 + |B|^2)\,l$$

Hence

$$\bar p = k\hbar\,\frac{|A|^2 - |B|^2}{|A|^2 + |B|^2} \quad\text{while}\quad \overline{p^2} = k^2\hbar^2$$

It will also be observed that $\psi$ is an eigenstate of the operator $(-i\hbar\,\partial/\partial x)^2$,
but not of $-i\hbar\,\partial/\partial x$.

Translated into particle language, this state of affairs must be expressed
as follows. Since all particles have a root mean square momentum of
magnitude $k\hbar$, and yet the mean momentum along $x$ is smaller than $k\hbar$,
some of them must be traveling to the right, others to the left, with momentum
$k\hbar$. If a fraction $\alpha$ travels to the right and $\beta$ to the left,

$$(\alpha - \beta)k\hbar = \bar p, \qquad (\alpha + \beta)k\hbar = \sqrt{\overline{p^2}}$$

whence

$$\frac{\beta}{\alpha} = \frac{\sqrt{\overline{p^2}} - \bar p}{\sqrt{\overline{p^2}} + \bar p} = \frac{|B|^2}{|A|^2}$$

In our problem, $\beta/\alpha$ is the reflection coefficient of the barrier of potential
energy $V$. In view of eq. (37) it is given by

$$R = \frac{|k_l - k_r|^2}{|k_l + k_r|^2}$$

Two cases of interest may be distinguished, (a) $E < V$, (b) $E > V$. In
classical mechanics, a particle would certainly be reflected in case a
($R = 1$), certainly transmitted in case b ($R = 0$). The matter is not
quite so simple in quantum mechanics. In case a, $k_l$ is real but $k_r$ is
imaginary. $R$ is thus always 1 in agreement with the classical prediction.
But in case b both $k_l$ and $k_r$ are real, and $R < 1$ but not zero. Hence
every potential barrier reflects particles, even though classically one would
expect them to be only retarded.

Before leaving this matter, we must justify the procedure of setting $B_r$
equal to zero. This is now seen to mean omission of a beam of particles
travelling to the left in the region to the right of the origin. Had such a
beam been included, the physical condition corresponding to $\psi$ would have
implied the incidence of two beams of particles upon the origin, one from
the left and one from the right. In that case, $\beta/\alpha$ is not the reflection
coefficient of the barrier. The $\psi$-function we have chosen permits that
interpretation, for it corresponds to one beam incident from the left, one
reflected and one transmitted beam.
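The two cases $E < V$ and $E > V$ can be evaluated from the expression $R = |k_l - k_r|^2/|k_l + k_r|^2$ with a few lines of code. The sketch below is an added illustration. It assumes units with $\hbar = m = 1$ and uses a complex square root so that $k_r$ becomes imaginary when $E < V$; the function name is hypothetical.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1
MASS = 1.0  # assumption: units in which m = 1


def reflection_coefficient(energy, step):
    """R = |k_l - k_r|^2 / |k_l + k_r|^2 for a potential step of height `step`."""
    k_l = cmath.sqrt(2 * MASS * energy) / HBAR
    k_r = cmath.sqrt(2 * MASS * (energy - step)) / HBAR  # imaginary if energy < step
    return abs(k_l - k_r) ** 2 / abs(k_l + k_r) ** 2


print(reflection_coefficient(0.5, 1.0))    # case a: E < V
print(reflection_coefficient(2.0, 1.0))    # case b: E > V
print(reflection_coefficient(200.0, 1.0))  # E much greater than V
```

Below the step the reflection is total; above it $R$ is less than 1 but not zero, and it falls off as $E$ grows relative to $V$.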


Problem. Prove that $\bar p$ is the same whether it is computed to the left or to the right
of the origin [use conditions (37)].


A study of more complicated barriers, such as that depicted in Fig. 2,
reveals a new and striking feature: the “tunnel effect.” The energy $E$
of the incident particles is assumed to be greater than $V_1$ and $V_3$, but
smaller than $V_2$, so that from the classical point of view every particle
would certainly be reflected. If we define

$$k_1^2 = \frac{2m}{\hbar^2}(E - V_1); \qquad \kappa^2 = \frac{2m}{\hbar^2}(V_2 - E); \qquad k_3^2 = \frac{2m}{\hbar^2}(E - V_3)$$

the $\psi$-functions for the three regions are

$$\psi_1 = A_1e^{ik_1x} + B_1e^{-ik_1x}, \qquad x < 0$$
$$\psi_2 = A_2e^{\kappa x} + B_2e^{-\kappa x}, \qquad 0 < x < a$$
$$\psi_3 = A_3e^{ik_3x}, \qquad x > a$$

The continuity conditions for $\psi$ and $\psi'$ at both $x = 0$ and $x = a$ are
seen to be:

$$A_1 + B_1 = A_2 + B_2$$
$$ik_1(A_1 - B_1) = \kappa(A_2 - B_2)$$
$$A_2e^{\kappa a} + B_2e^{-\kappa a} = A_3e^{ik_3a}$$
$$\kappa(A_2e^{\kappa a} - B_2e^{-\kappa a}) = ik_3A_3e^{ik_3a}$$

[Fig. 11-2. Potential barrier of height V_2 between x = 0 and x = a, with V_1 to the left and V_3 to the right.]

From these, $B_1$, $A_2$, and $B_2$ may be eliminated. When this is done we
obtain the relation

$$A_1 = \tfrac12A_3e^{ik_3a}\Big[\Big(1 + \frac{k_3}{k_1}\Big)\cosh\kappa a + i\Big(\frac{\kappa}{k_1} - \frac{k_3}{\kappa}\Big)\sinh\kappa a\Big]$$  (11-38)

An argument similar to that which led us to identify the reflection
coefficient $R$ with $|B|^2/|A|^2$, shows the transmission coefficient of the
present barrier to be

$$T = \frac{|A_3|^2k_3}{|A_1|^2k_1}$$

This may be computed from (38). In doing so we assume that $\kappa a \gg 1$
so that both $\cosh\kappa a$ and $\sinh\kappa a$ become $\frac12e^{\kappa a}$. Then

$$T = \frac{16\,k_1k_3\kappa^2}{(k_1^2 + \kappa^2)(k_3^2 + \kappa^2)}\,e^{-2\kappa a}$$

As the width of the barrier increases, the factor $e^{-2\kappa a}$ (sometimes called
the “transparency factor”) rapidly diminishes.
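The transmission coefficient can be evaluated both from eq. (38) as it stands and from the thick-barrier form obtained by replacing $\cosh\kappa a$ and $\sinh\kappa a$ with $\frac12e^{\kappa a}$. The sketch below is an added illustration: it assumes units with $\hbar = m = 1$, the function names are hypothetical, and the thick-barrier expression coded here is the algebraic consequence of (38) under that replacement.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1
MASS = 1.0  # assumption: units in which m = 1


def wave_numbers(energy, v1, v2, v3):
    """Return (k1, kappa, k3) for V1 < E, V3 < E and E < V2."""
    k1 = math.sqrt(2 * MASS * (energy - v1)) / HBAR
    kappa = math.sqrt(2 * MASS * (v2 - energy)) / HBAR
    k3 = math.sqrt(2 * MASS * (energy - v3)) / HBAR
    return k1, kappa, k3


def transmission_exact(energy, v1, v2, v3, a):
    """T = |A3|^2 k3 / (|A1|^2 k1), with A1/A3 taken from eq. (38)."""
    k1, kappa, k3 = wave_numbers(energy, v1, v2, v3)
    # The factor exp(i*k3*a) in (38) has unit modulus and drops out of |A1/A3|.
    ratio = 0.5 * complex((1 + k3 / k1) * math.cosh(kappa * a),
                          (kappa / k1 - k3 / kappa) * math.sinh(kappa * a))
    return (k3 / k1) / abs(ratio) ** 2


def transmission_thick(energy, v1, v2, v3, a):
    """Limiting form of T for kappa*a >> 1."""
    k1, kappa, k3 = wave_numbers(energy, v1, v2, v3)
    return (16 * k1 * k3 * kappa ** 2 * math.exp(-2 * kappa * a)
            / ((k1 ** 2 + kappa ** 2) * (k3 ** 2 + kappa ** 2)))


for width in (0.5, 2.0, 4.0):
    print(width,
          transmission_exact(1.0, 0.0, 3.0, 0.5, width),
          transmission_thick(1.0, 0.0, 3.0, 0.5, width))
```

The two columns agree closely once $\kappa a$ is large, and both fall off roughly as $e^{-2\kappa a}$ with increasing width.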

The surprising fact is that particles are able to “ tunnel ” through the 
barrier although their kinetic energy is not great enough to allow them to 
pass it. Classically speaking, the kinetic energy of a particle would be 
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negative while it is in region 2. Quantum mechanically, this statement is 
devoid of meaning, since it is improper to compute Æ — V for this region 
alone. 

Fig. 3 gives a qualitative plot of the (real part of the) ¥-function in 
the three regions here considered. It is seen that the barrier attenuates 
the wave coming from the left, permitting a fraction of its amplitude to 
pass out ata. The situation is quite analogous to the passage of a wave 
through an absorbing layer. 

11.11. Simple Harmonic Oscillator.—The potential energy, usually 
expressed in the form $42, is 4mw*x? when written in terms of the mass m 
and the classical frequency w = 2rv of the oscillator. The meaning of w is 


n More complicated barriers are discussed by Condon, E. U., Rev. Mod. Phys. 8, 
43 (1931); Eckart, C., Phys. Rev. 35, 1303 (1930). 
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simply that of a parameter appearing in V; we must no longer expect the 
oscillator to go back and forth w/2r times per second. The Sehrödinger 
equation is 


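$$\frac{d^2\psi}{dx^2} + (\epsilon - \beta^2x^2)\psi = 0 \tag{11-39}$$

if we use the abbreviations

$$\epsilon = \frac{2mE}{\hbar^2}, \qquad \beta = \frac{m\omega}{\hbar}$$

The substitution $\xi = \sqrt{\beta}\,x$ reduces (39) to the form of the differential
equation for "Hermite's orthogonal functions,"

$$\frac{d^2\psi}{d\xi^2} + \left(\frac{\epsilon}{\beta} - \xi^2\right)\psi = 0$$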


which was studied in Chapter 2 (cf. eq. 2-66). It was there found that its
solution is of the form $e^{-\xi^2/2}H(\xi)$, $H(\xi)$ being a solution of Hermite's
equation (2-62). Now $H(\xi)$ is a polynomial if the quantity α, which
corresponds to the present $\tfrac{1}{2}(\epsilon/\beta - 1)$, is an integer. Unless this is true, H is a
superposition of the infinite sequences (2-63) and (2-64). But both of
these approach infinity like $e^{\xi^2}$, as closer inspection will show. If they are
multiplied by $e^{-\xi^2/2}$, they will not yield a ψ-function which has an
integrable square between the limits −∞ and +∞, which we are here assuming
to exist. Hence $H(\xi)$ must be chosen in its polynomial form, $H_n(\xi)$.
Also, $\tfrac{1}{2}(\epsilon/\beta - 1) = n$, and this leads to

$$E_n = (n + \tfrac{1}{2})\hbar\omega = (n + \tfrac{1}{2})h\nu \tag{11-40}$$

$$\psi_n = c\,e^{-\beta x^2/2}H_n(\sqrt{\beta}\,x) \tag{11-41}$$
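As a numerical illustration of the result just stated, the sketch below checks by finite differences that $e^{-\beta x^2/2}H_n(\sqrt{\beta}\,x)$ satisfies $\psi'' + (\epsilon - \beta^2x^2)\psi = 0$ when $\epsilon = (2n + 1)\beta$. The function names, the step size and the sample point are assumptions made for the example; the Hermite polynomials are generated from the standard recurrence $H_{k+1} = 2\xi H_k - 2kH_{k-1}$.

```python
import math

def hermite(n, xi):
    """H_n(xi) from the recurrence H_{k+1} = 2*xi*H_k - 2*k*H_{k-1}."""
    h_prev, h_cur = 1.0, 2.0 * xi
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h_cur = h_cur, 2.0 * xi * h_cur - 2.0 * k * h_prev
    return h_cur

def psi(n, x, beta):
    """Unnormalized oscillator function exp(-beta x^2 / 2) H_n(sqrt(beta) x)."""
    return math.exp(-0.5 * beta * x * x) * hermite(n, math.sqrt(beta) * x)

def residual(n, x, beta, step=1e-3):
    """psi'' + (eps - beta^2 x^2) psi with eps = (2n + 1) beta; should be near zero."""
    second = (psi(n, x + step, beta) - 2.0 * psi(n, x, beta)
              + psi(n, x - step, beta)) / step ** 2
    eps = (2 * n + 1) * beta
    return second + (eps - beta ** 2 * x ** 2) * psi(n, x, beta)

if __name__ == "__main__":
    for n in range(5):
        print(n, residual(n, 0.7, 2.0))
```

The residual is limited only by the finite-difference error; choosing any other value of ε leaves a residual of the order of ψ itself.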


If the oscillator has three degrees of freedom, the Schrödinger equation is

$$\nabla^2\psi + (\epsilon - \beta^2r^2)\psi = 0$$

when the same abbreviations as above are used. The method of separation
of variables (Chapter 7), which involves the substitution of
X(x)·Y(y)·Z(z) for ψ, at once reduces this partial differential equation to three
ordinary ones

$$X'' + (\epsilon_1 - \beta^2x^2)X = 0, \qquad Y'' + (\epsilon_2 - \beta^2y^2)Y = 0, \qquad Z'' + (\epsilon_3 - \beta^2z^2)Z = 0$$

provided that $\epsilon_1 + \epsilon_2 + \epsilon_3 = \epsilon$. Each of these has a solution of the form
(41), so that

$$\psi_{n_1n_2n_3} = c\,e^{-\beta r^2/2}\,H_{n_1}(\sqrt{\beta}\,x)\,H_{n_2}(\sqrt{\beta}\,y)\,H_{n_3}(\sqrt{\beta}\,z) \tag{11-42}$$

and

$$E_{n_1n_2n_3} = (n_1 + n_2 + n_3 + \tfrac{3}{2})\hbar\omega \tag{11-43}$$

The orthogonality of the functions (41) has been proved in eq. 3-92.
From this formula, the normalizing constant c may also be computed.

For if

$$c^2\int_{-\infty}^{\infty} e^{-\beta x^2}H_n^2(\sqrt{\beta}\,x)\,dx = 1$$

then

$$c = \left(\frac{\beta}{\pi}\right)^{1/4}(2^n n!)^{-1/2}$$

A similar computation, which involves three integrations, yields for the
constant c of eq. (42) the value

$$c = \left(\frac{\beta}{\pi}\right)^{3/4}\left(n_1!\,n_2!\,n_3!\,2^{n_1+n_2+n_3}\right)^{-1/2}$$

Further mathematical details concerning the functions here encountered,
as well as a table of the $H_n$-polynomials, are given in sec. 3.10.
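The one-dimensional normalizing constant can be confirmed by direct quadrature. The following sketch assumes the value $c = (\beta/\pi)^{1/4}(2^n n!)^{-1/2}$ and integrates $c^2e^{-\beta x^2}H_n^2(\sqrt{\beta}\,x)$ with the trapezoidal rule; the integration window and step count are arbitrary choices made for the example.

```python
import math

def hermite(n, xi):
    """H_n(xi) from the recurrence H_{k+1} = 2*xi*H_k - 2*k*H_{k-1}."""
    h_prev, h_cur = 1.0, 2.0 * xi
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h_cur = h_cur, 2.0 * xi * h_cur - 2.0 * k * h_prev
    return h_cur

def normalization_integral(n, beta, half_width=12.0, steps=40000):
    """Trapezoidal estimate of the integral of psi_n^2 over (-half_width, half_width)."""
    c = (beta / math.pi) ** 0.25 / math.sqrt(2.0 ** n * math.factorial(n))
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        value = c * math.exp(-0.5 * beta * x * x) * hermite(n, math.sqrt(beta) * x)
        total += weight * value * value
    return total * h

if __name__ == "__main__":
    for n in range(4):
        print(n, normalization_integral(n, 1.7))
```

For the three-dimensional function (42) the integral factorizes, so the constant is the product of three such one-dimensional factors.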


Problem. The treatment above implied that the 3-dimensional oscillator was
isotropic; i.e., bound with equal forces in all directions. Calculate eigenvalues and
eigenfunctions for an anisotropic oscillator with potential energy

$$V = \tfrac{1}{2}m(\omega_1^2x^2 + \omega_2^2y^2 + \omega_3^2z^2)$$


11.12. Rigid Rotator, Eigenvalues and Eigenfunctions of $L^2$.—A rigid
rotator is a pair of point masses held together by a rigid, inflexible and
inextensible (massless) bond. A diatomic molecule is a fair approximation
to a rigid rotator. Before attempting to solve the Schrödinger equation
for such a system it is well to digress briefly and consider the eigenvalue
equation for an operator which so far we have not introduced,
but which is easily constructed. We have seen that the operators
corresponding to the components of angular momentum of a particle are

$$L_x = -i\hbar\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right), \quad L_y = -i\hbar\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right), \quad L_z = -i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) \tag{11-44}$$

From these, we wish to construct the operator

$$L^2 = L_x^2 + L_y^2 + L_z^2 \tag{11-45}$$
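It is advantageous to do this in polar (spherical) coordinates.* Putting
$x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, $z = r\cos\theta$, we have

$$\frac{\partial}{\partial x} = \sin\theta\cos\varphi\frac{\partial}{\partial r} + \frac{1}{r}\cos\theta\cos\varphi\frac{\partial}{\partial\theta} - \frac{\sin\varphi}{r\sin\theta}\frac{\partial}{\partial\varphi}$$

$$\frac{\partial}{\partial y} = \sin\theta\sin\varphi\frac{\partial}{\partial r} + \frac{1}{r}\cos\theta\sin\varphi\frac{\partial}{\partial\theta} + \frac{\cos\varphi}{r\sin\theta}\frac{\partial}{\partial\varphi}$$

$$\frac{\partial}{\partial z} = \cos\theta\frac{\partial}{\partial r} - \frac{1}{r}\sin\theta\frac{\partial}{\partial\theta}$$

When these results are introduced in (44) and (45) is formed, there results

$$L^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right] \tag{11-46}$$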
The observable values which the square of the angular momentum may
assume are the eigenvalues ρ of the equation

$$L^2\psi = \rho\psi \tag{11-47}$$


This equation is easily solved by the method of separation of variables
(cf. Chapter 7). Clearly, ψ is a function of θ and φ. Put ψ = Θ(θ)·Φ(φ)
into (47). This equation will then break up into two ordinary equations
(the process is analogous to the construction of eqs. 7-42a and 7-42b):

$$\frac{d^2\Phi}{d\varphi^2} + m^2\Phi = 0, \qquad \frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \left(\frac{\rho}{\hbar^2} - \frac{m^2}{\sin^2\theta}\right)\Theta = 0$$

The quantity m must be an integer to insure single-valuedness of Φ.
The equation for Φ therefore has the solution $\Phi = \text{const.}\,e^{im\varphi}$, m an
integer. The equation for Θ is the equation for associated Legendre functions
(eq. 7-42b), except that the constant l(l + 1) appearing there is here
replaced by $\rho/\hbar^2$. The solution previously obtained is

$$\Theta = \sin^m\theta\,\frac{d^m}{(d\cos\theta)^m}P_l(\cos\theta)$$


* See also the problem at the end of this section.

Now the Legendre function $P_l$ was shown to behave singularly at
cos θ = ±1 unless l is an integer; in fact it would contain unlimited powers
of x (= cos θ). The same would be true for Θ if l were arbitrary. But in
that case $\int\psi^*\psi\,d\tau$, which contains the factor

$$\int_0^{\pi}\Theta^2\sin\theta\,d\theta = \int_{-1}^{1}\Theta^2\,dx$$

would certainly not exist. We conclude, therefore, that l must be an
integer, and that the eigenvalues of $L^2$ are

$$\rho = l(l + 1)\hbar^2$$
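A short numerical sketch may make the eigenvalue statement concrete. It builds $\Theta = \sin^m\theta\,d^mP_l/(d\cos\theta)^m$ from the Legendre recurrence and checks by finite differences that $\Theta'' + \cot\theta\,\Theta' + [l(l+1) - m^2/\sin^2\theta]\Theta$ vanishes, i.e. that the Θ-equation holds with the eigenvalue equal to $l(l+1)\hbar^2$. All function names and the sample angles are illustrative assumptions.

```python
import math

def legendre_coeffs(l):
    """Coefficients of P_l(x), lowest power first, from
    (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    prev, cur = [1.0], [0.0, 1.0]
    if l == 0:
        return prev
    for k in range(1, l):
        shifted = [0.0] + cur                                   # x * P_k
        padded = prev + [0.0] * (len(shifted) - len(prev))      # P_{k-1}, same length
        nxt = [((2 * k + 1) * s - k * q) / (k + 1) for s, q in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur

def theta_function(l, m, theta):
    """sin^m(theta) times the m-th derivative of P_l evaluated at cos(theta)."""
    coeffs = legendre_coeffs(l)
    for _ in range(m):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]      # differentiate once
    x = math.cos(theta)
    poly = sum(c * x ** i for i, c in enumerate(coeffs))
    return math.sin(theta) ** m * poly

def residual(l, m, theta, step=1e-4):
    """Left side of the Theta equation with eigenvalue l(l+1); should be near zero."""
    f0 = theta_function(l, m, theta)
    fp = theta_function(l, m, theta + step)
    fm = theta_function(l, m, theta - step)
    first = (fp - fm) / (2.0 * step)
    second = (fp - 2.0 * f0 + fm) / step ** 2
    s = math.sin(theta)
    return second + math.cos(theta) / s * first + (l * (l + 1) - m * m / (s * s)) * f0

if __name__ == "__main__":
    print(residual(2, 1, 0.6), residual(3, 2, 1.0))
```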


On the other hand, the eigenfunctions of $L^2$ are of the form

$$\sin^m\theta\,\frac{d^m}{(d\cos\theta)^m}P_l(\cos\theta)\,e^{im\varphi} = P_l^m(\cos\theta)\,e^{im\varphi} \tag{11-48}$$

in the notation adopted in Chapter 3 (cf. eq. 3-43). Since the eigenvalue ρ
does not depend on m but only on l, functions like (48) with different m
will satisfy eq. (47). The most general solution of that equation is
therefore,¹²

$$\psi = \sum_{m=-l}^{l} c_m P_l^m(\cos\theta)\,e^{im\varphi} \tag{11-49}$$

In Chapter 7 this function has already been encountered; it is called a
spherical harmonic and denoted by $Y_l(\theta,\varphi)$ (cf. eq. 7-43 et seq.). Hence

$$\psi = Y_l(\theta,\varphi) \tag{11-50}$$


Since $d\tau = \sin\theta\,d\theta\,d\varphi$, normalization requires that

$$\int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}d\varphi\,\psi^*\psi = 1$$

When (49) is inserted the integral becomes

$$2\pi\sum_{m=-l}^{l}|c_m|^2\int_{-1}^{1}[P_l^m(x)]^2\,dx = \frac{4\pi}{2l+1}\sum_{m=-l}^{l}|c_m|^2\,\frac{(l+|m|)!}{(l-|m|)!}$$

(cf. eq. 3-62). Hence, for normalization, the constants $c_m$ appearing in
(49) must satisfy the relation

$$\sum_{m=-l}^{l}|c_m|^2\,\frac{(l+|m|)!}{(l-|m|)!} = \frac{2l+1}{4\pi}$$

and are otherwise arbitrary.
We are now ready to return to the problem of the rigid rotator. In
the first place, we shall assume it proper to replace it by a single mass,
rigidly tied to a center of rotation, and having the same moment of inertia
as the original system. The condition upon the state function in accord
with this assumption, aside from single-valuedness, is simply r = a,
a constant. The best procedure is therefore to write down the Schrödinger
equation for a particle moving in three dimensions, and then to
put r = a, ∂ψ/∂r = 0. This requires the use of polar (spherical) coordinates.
The potential energy V, in this case, is clearly constant and may
be taken to be zero.


12 We define here and elsewhere: $P_l^{-m} = P_l^m$, as in (3-62).
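Schrödinger's equation reads* (cf. Chapter 5 for transformation of $\nabla^2$)

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{2M}{\hbar^2}E\psi = 0 \tag{11-51}$$

When r is put equal to a the first term on the left vanishes, and the
remainder becomes very similar to $L^2\psi$. Indeed if we introduce a new
operator $\Lambda^2$ defined as $(1/\hbar^2)L^2$, eq. (51) may be written

$$\Lambda^2\psi = \frac{2Ma^2}{\hbar^2}E\psi \tag{11-52}$$

But the eigenvalues of $\Lambda^2$ are obviously l(l + 1), and its eigenfunctions
are the same as those of $L^2$. The constant $(2Ma^2/\hbar^2)E$ must be
identified with l(l + 1). Hence the eigenvalues and eigenfunctions are

$$E = \frac{\hbar^2}{2Ma^2}\,l(l+1); \qquad \psi = Y_l(\theta,\varphi) \tag{11-53}$$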


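Problem. Show by vector algebra that

$$\Lambda^2 = -(\mathbf{r}\times\nabla)^2 = -r^2\nabla^2 + 2r\frac{\partial}{\partial r} + r^2\frac{\partial^2}{\partial r^2}$$

Hint: Note that $(\mathbf{r}\times\nabla)^2 = \mathbf{r}\cdot[\nabla\times(\mathbf{r}\times\nabla)]$. Then use (4-26) for $\nabla\times(\mathbf{U}\times\mathbf{V})$.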


11.13. Motion in a Central Field.—By central field is meant a field of
force in which the potential energy is a function of r only; V is independent
of θ and φ. The isotropic three-dimensional oscillator treated in sec. 11
is an example of motion in a central field. Another is the motion of a
particle in a Coulomb field. It is to this last example, an electron attracted
by a positive point charge (hydrogen atom), that we shall chiefly direct our
attention. But before considering this specific case a few general features
of the central field problem will be exposed.

It is now clear that the Laplacian, $\nabla^2$, in spherical polar coordinates
has the form

$$\nabla^2 = \frac{1}{r^2}\left\{\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \Lambda^2\right\} \tag{11-54}$$


* To avoid confusion, we write M for the electron mass in this section, returning
to the symbol m in the next.




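where $\Lambda^2$ is given by (46) divided by $\hbar^2$. The eigenvalues of $\Lambda^2$ are
l(l + 1). The Schrödinger equation therefore reads

$$\frac{1}{r^2}\left\{\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \Lambda^2\right\}\psi + \frac{2m}{\hbar^2}[E - V(r)]\psi = 0 \tag{11-55}$$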
We write ψ as a product of a function R(r) and another, A(θ, φ), which
depends only on the angles. The operator $\Lambda^2$ acts only on A. Eq. (55),
after multiplication by $r^2$ and subsequent division by R·A, has the form

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2mr^2}{\hbar^2}[E - V(r)] = \frac{\Lambda^2A}{A} \tag{11-56}$$

The left-hand side of this equation is a function of r alone, the right a
function of θ and φ. By the argument which is familiar from Chapter 7,
each side must be a constant, say a. Thus

$$\Lambda^2A = aA$$

But this is simply the eigenvalue equation for $\Lambda^2$. We see, then, that

$$a = l(l+1), \quad\text{and}\quad A = Y_l(\theta,\varphi)$$
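With this value of the constant, eq. (56) becomes

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left\{\frac{2m}{\hbar^2}[E - V(r)] - \frac{l(l+1)}{r^2}\right\}R = 0 \tag{11-57a}$$

and the substitution U(r) = rR(r) reduces this to

$$U'' + \frac{2m}{\hbar^2}\left[E - V(r) - \frac{l(l+1)\hbar^2}{2mr^2}\right]U = 0 \tag{11-57b}$$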
The development so far has been totally independent of the form of V,
except in assuming it to be a function of r alone. The results obtained are
therefore valid for any central field. Summarizing them, we may say:

The energy states of a particle in a central field are always of the form

$$\psi = \frac{1}{r}\,U_l(r)\,Y_l(\theta,\varphi)$$

and the function $U_l$ is determined by eq. (57b). It was necessary to add a
subscript l to U because the differential equation contains l as a parameter.
The energies E are obtained solely from eq. (57b).

That equation looks very much like the one-dimensional Schrödinger
equation,

$$\psi'' + \frac{2m}{\hbar^2}[E - V(x)]\psi = 0 \tag{11-58}$$

but with the term $l(l+1)\hbar^2/2mr^2$ added to the normal potential energy.
What is the meaning of that term? In classical mechanics, the energy of a
particle moving in three dimensions differs from that of a one-dimensional
particle by the kinetic energy of rotation, $\tfrac{1}{2}mr^2\omega^2$. This is precisely the
quantity $l(l+1)\hbar^2/2mr^2$, for we have seen that $l(l+1)\hbar^2$ is the certain
value of the square of the angular momentum for the state $Y_l$, in classical
language $(mr^2\omega)^2$, which when divided by $2mr^2$, gives exactly the kinetic
energy of rotation.

There is, however, one further difference between (57b) and (58). 
The fundamental range of r in (57b) starts at r = 0 and is limited to 
positive values, whereas the range of x in (58) may include negative values.
This fact often has a more important effect on the eigenvalues than the 
addition of the terms just mentioned. 

Let us now solve eq. (57b), assuming a Coulomb field, i.e., $V(r) = -e^2/r$.
The energies E will then be the energy levels of the hydrogen atom.¹³ For
sufficiently large r the solution is determined by

$$U'' - \left(\frac{\alpha}{2}\right)^2U = 0 \tag{11-59}$$

provided we define

$$\left(\frac{\alpha}{2}\right)^2 = -\frac{2mE}{\hbar^2} \tag{11-60}$$

The solution of (59) is $U_\infty = c_1e^{\alpha r/2} + c_2e^{-\alpha r/2}$ and this represents
the behavior of the correct U at ∞. Let us first suppose that α is real,
which means that the energy of the particle is negative. U will then
certainly not have an integrable square (note that the radial integral has the
form $\int_0^\infty R^2r^2\,dr = \int_0^\infty U^2\,dr$) if the coefficient $c_1$ fails to vanish. But we
cannot simply put it equal to zero because we have boundary conditions to
fulfill! Without going further in our analysis at the moment we expect,
therefore, that only special values of α will produce acceptable solutions
when α is real. If the total energy of the particle is negative (classically
speaking, the particle is bound to the attracting center), the energy is
expected to be quantized. The following analysis will bear this out.

If α is imaginary, which means that E is positive, U shows sinusoidal
behavior. It has, in fact, the typical form of the state function for a free
particle, and the failure of normalization occurs in the milder manner
which we have previously found associated with the presence of a continuous
spectrum of eigenvalues. There is indeed no way of choosing $c_1$ or $c_2$
or α which would make one $U_\infty$ more acceptable than another. We conclude
that, when E is positive, the energy spectrum is continuous.

From the point of view of classical physics this result is welcome, for
when E is positive the particle is ionized and moves through space, its
energy being unrestricted.


13 If $e^2$ is replaced by $Ze^2$, Z = 2 represents ionized helium, Z = 3 doubly ionized
lithium, etc.

We now discuss the bound states in a more rigorous manner. Put
E = −W, so that W is positive. Our interest will now return to eq. (57a)
which forms a more suitable basis for the present discussion. Let r = x/α,
where α is defined by (60). Eq. (57a) then reads, after some cancellation,

$$\frac{d^2R}{dx^2} + \frac{2}{x}\frac{dR}{dx} + \left[\frac{2me^2}{\alpha\hbar^2}\,\frac{1}{x} - \frac{1}{4} - \frac{l(l+1)}{x^2}\right]R = 0 \tag{11-61}$$


But this is precisely the differential equation for associated Laguerre
functions, which was studied in Chapter 2 (cf. eq. 71). For our immediate
purpose we shall write that equation with n* in place of n, since otherwise
our notation would be in conflict with physical convention. To summarize
the results of sec. 2.16:

The equation

$$y'' + \frac{2}{x}\,y' + \left[\left(n^* - \frac{k-1}{2}\right)\frac{1}{x} - \frac{1}{4} - \frac{k^2-1}{4x^2}\right]y = 0 \tag{11-62}$$

has a solution possessing an integrable square¹⁴ of the form

$$y = e^{-x/2}\,x^{(k-1)/2}\,L_{n^*}^{k}(x) \tag{11-63}$$

provided n* and k are positive integers. Moreover, n* − k ≥ 0 since
otherwise $L_{n^*}^{k}$ would vanish.

14 The reader should convince himself of this fact by going back to sec. 2.16.

On comparing (61) and (62) we find, in the first place, that $(k^2 - 1)/4 = l(l + 1)$, hence

$$k = 2l + 1$$

Secondly,

$$n^* - \frac{k-1}{2} = n^* - l = \frac{2me^2}{\alpha\hbar^2}$$

When the value of α is inserted here and the relation is solved for W,
we find

$$W = \frac{1}{2}\,\frac{me^4}{(n^* - l)^2\hbar^2}$$

Because of the conditions on n* and k, the quantity n* − l cannot be
zero. It is usually denoted by n and called the total quantum number
(after the rôle it played in the Bohr theory). Our conclusion, then, is this:

The energy states of the hydrogen atom are

$$-W = E_n = -\frac{1}{2}\,\frac{me^4}{n^2\hbar^2} \tag{11-64}$$
and the corresponding eigenfunctions are, in accordance with (63),

$$R_{n,l} = c_{n,l}\,e^{-x/2}\,x^l\,L_{n+l}^{2l+1}(x) \tag{11-65}$$

the variable x being defined by

$$x = \alpha r = \frac{\sqrt{8mW}}{\hbar}\,r = \frac{2me^2}{n\hbar^2}\,r$$

In the Bohr theory of hydrogen, the first orbit has a radius

$$a_0 = \frac{\hbar^2}{me^2} = 0.53\times10^{-8}\ \text{cm.}$$

It is sometimes convenient to express x in terms of it. Thus $\alpha = 2/na_0$,
and

$$x = \frac{2r}{na_0} \tag{11-66}$$


It is to be noticed that x represents a different variable for each energy
state; the quantum number n determining W appears as a scale factor in
the dimensionless variable x.
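For orientation, the level formula $E_n = -me^4/2n^2\hbar^2$ and the Bohr radius $a_0 = \hbar^2/me^2$ can be evaluated in Gaussian (c.g.s.) units. The numerical constants below are rounded modern values supplied for this sketch, not figures taken from the text, and the optional argument Z applies the replacement of $e^2$ by $Ze^2$ for hydrogen-like ions.

```python
# Rounded physical constants in Gaussian (c.g.s.) units; assumed values for this sketch.
M_ELECTRON = 9.1094e-28      # electron mass, g
E_CHARGE = 4.8032e-10        # elementary charge, esu
HBAR = 1.05457e-27           # Planck's constant over 2*pi, erg s
ERG_PER_EV = 1.60218e-12     # conversion factor

def energy_level(n, Z=1):
    """E_n = -(1/2) m (Z e^2)^2 / (n^2 hbar^2), in erg."""
    return -0.5 * M_ELECTRON * (Z * E_CHARGE ** 2) ** 2 / (n ** 2 * HBAR ** 2)

def bohr_radius():
    """a_0 = hbar^2 / (m e^2), in cm."""
    return HBAR ** 2 / (M_ELECTRON * E_CHARGE ** 2)

def scaled_radius(r, n):
    """Dimensionless variable x = 2 r / (n a_0) of eq. (66)."""
    return 2.0 * r / (n * bohr_radius())

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, energy_level(n) / ERG_PER_EV, "eV")
    print("a0 =", bohr_radius(), "cm")
```

The ground level comes out near −13.6 eV and the radius near 0.53 × 10⁻⁸ cm, the value quoted above.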

Some integrals involving $R_{n,l}$, which occur frequently in physical and
chemical problems, have been evaluated in sec. 3.11. See also the example
at the end of sec. 3.11, which is of interest in this connection.

For later use, we write down in explicit form the state function for the
normal hydrogen atom. It is

$$R_{1,0} = c_{1,0}\,e^{-x/2}\,L_1^1 = 2a_0^{-3/2}\,e^{-r/a_0}$$

For this state $Y_0 = \text{constant} = (4\pi)^{-1/2}$ when the function is normalized.
Hence the total ground state function is

$$\psi_0 = (\pi a_0^3)^{-1/2}\,e^{-r/a_0} \tag{11-67}$$

ψ-functions for the higher states are listed in explicit form in Pauling and
Wilson.¹⁵
When the charge on the nucleus is not e but Ze, $a_0$ must be replaced by
$a_0/Z$, so that

$$\psi_0 = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2}e^{-Zr/a_0} \tag{11-67a}$$

15 Pauling, L., and Wilson, E. B., Jr., "Introduction to Quantum Mechanics,"
McGraw-Hill Book Co., 1935.
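The normalization of the ground state function can be checked by a simple radial quadrature. The sketch below assumes the form $\psi_0 = (Z^3/\pi a_0^3)^{1/2}e^{-Zr/a_0}$ and integrates $4\pi r^2\psi_0^2$ by the trapezoidal rule; lengths are measured in units of $a_0$ by default, and the cutoff radius and step count are arbitrary choices.

```python
import math

def psi0(r, Z=1.0, a0=1.0):
    """Ground state function (Z^3 / (pi a0^3))^(1/2) exp(-Z r / a0)."""
    return math.sqrt(Z ** 3 / (math.pi * a0 ** 3)) * math.exp(-Z * r / a0)

def norm_integral(Z=1.0, a0=1.0, steps=20000):
    """Trapezoidal estimate of the integral of 4 pi r^2 psi0^2 from 0 to 40 a0/Z."""
    r_max = 40.0 * a0 / Z
    h = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * 4.0 * math.pi * r * r * psi0(r, Z, a0) ** 2
    return total * h

if __name__ == "__main__":
    print(norm_integral(), norm_integral(Z=2.0))
```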
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Problem a. Using the results of Chapter 3, show that the normalizing factor in 


(65) is

$$c_{n,l} = \left(\frac{2}{na_0}\right)^{3/2}\left\{\frac{(n-l-1)!}{2n[(n+l)!]^3}\right\}^{1/2}$$

Problem b. Work out the problem of the isotropic oscillator using spherical
coordinates, and show that the results agree with those obtained in (42) and (43).


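11.14. Symmetrical Top.—In dealing with the problem of a rotating
rigid body attention must be given to the kinetic energy operator. To
obtain it we first observe that its form in rectangular coordinates, for the
n particle problem (cf. sec. 11.31) is

$$T = -\frac{\hbar^2}{2}\sum_{i=1}^{n}\frac{1}{m_i}\left(\frac{\partial^2}{\partial x_i^2} + \frac{\partial^2}{\partial y_i^2} + \frac{\partial^2}{\partial z_i^2}\right)$$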


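The position of a rigid body is best expressed in terms of the Eulerian
angles, introduced in sec. 9.5. It was there shown that the classical
kinetic energy is given by

$$T_c = \tfrac{1}{2}\sum_i m_i(\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2) = \tfrac{1}{2}A\dot\beta^2 + \tfrac{1}{2}A\dot\alpha^2\sin^2\beta + \tfrac{1}{2}C(\dot\gamma + \dot\alpha\cos\beta)^2$$

Let us define a line element constructed from the Cartesian coordinates

$$\xi_i = \sqrt{m_i}\,x_i, \qquad \eta_i = \sqrt{m_i}\,y_i, \qquad \zeta_i = \sqrt{m_i}\,z_i$$

as follows:

$$ds^2 = \sum_i(d\xi_i^2 + d\eta_i^2 + d\zeta_i^2) \tag{11-68}$$

This is clearly identical with $2T_c\,dt^2$. From the form of $T_c$ in Eulerian
coordinates it is seen that $ds^2$ in these coordinates is given by

$$ds^2 = A\,d\beta^2 + A\sin^2\beta\,d\alpha^2 + C(d\gamma + \cos\beta\,d\alpha)^2 \tag{11-69}$$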

Now the quantum mechanical form of T is the Laplacian operator
corresponding to the line element $ds^2$, multiplied by $-\hbar^2/2$. The problem is
therefore to transform the Laplacian operator from a set of coordinates in
terms of which the line element is given by (68), to a new set in terms of
which the line element is (69).

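This problem has been discussed in sec. 5.17. If

$$ds^2 = \sum_{\mu\nu}g_{\mu\nu}\,dq_\mu\,dq_\nu$$

then

$$\nabla^2\psi = \frac{1}{\sqrt{g}}\sum_{\mu\nu}\frac{\partial}{\partial q_\mu}\left(\sqrt{g}\,g^{\mu\nu}\frac{\partial\psi}{\partial q_\nu}\right)$$

where g is the determinant of the $g_{\mu\nu}$ and $(g^{\mu\nu})$ is the matrix inverse to $(g_{\mu\nu})$.
On identifying the $g_{\mu\nu}$ from (69) we find (putting $q_1 = \beta$, $q_2 = \alpha$,
$q_3 = \gamma$)

$$(g_{\mu\nu}) = \begin{pmatrix} A & 0 & 0 \\ 0 & A\sin^2\beta + C\cos^2\beta & C\cos\beta \\ 0 & C\cos\beta & C \end{pmatrix}$$

and hence

$$(g^{\mu\nu}) = \begin{pmatrix} \dfrac{1}{A} & 0 & 0 \\ 0 & \dfrac{1}{A\sin^2\beta} & -\dfrac{\cos\beta}{A\sin^2\beta} \\ 0 & -\dfrac{\cos\beta}{A\sin^2\beta} & \dfrac{1}{C} + \dfrac{\cos^2\beta}{A\sin^2\beta} \end{pmatrix}$$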
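The inversion of the metric is easily verified numerically. The sketch below assumes the covariant components read off from $ds^2 = A\,d\beta^2 + A\sin^2\beta\,d\alpha^2 + C(d\gamma + \cos\beta\,d\alpha)^2$ with the ordering $q_1 = \beta$, $q_2 = \alpha$, $q_3 = \gamma$, and multiplies the two matrices; the function names and sample values of A, C and β are arbitrary.

```python
import math

def metric(beta, A, C):
    """Covariant components g_mu_nu for (q1, q2, q3) = (beta, alpha, gamma)."""
    s, c = math.sin(beta), math.cos(beta)
    return [[A, 0.0, 0.0],
            [0.0, A * s * s + C * c * c, C * c],
            [0.0, C * c, C]]

def inverse_metric(beta, A, C):
    """Contravariant components g^mu_nu as stated for the symmetrical top."""
    s, c = math.sin(beta), math.cos(beta)
    return [[1.0 / A, 0.0, 0.0],
            [0.0, 1.0 / (A * s * s), -c / (A * s * s)],
            [0.0, -c / (A * s * s), 1.0 / C + c * c / (A * s * s)]]

def matmul3(a, b):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

if __name__ == "__main__":
    for row in matmul3(metric(0.8, 2.0, 3.5), inverse_metric(0.8, 2.0, 3.5)):
        print(row)
```

The product is the unit matrix, and the determinant $g = A^2C\sin^2\beta$ explains the factor $\sin\beta$ that enters through $\sqrt{g}$.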
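When these results are substituted in the expression for $\nabla^2\psi$ we have

$$T\psi = -\frac{\hbar^2}{2}\nabla^2\psi = -\frac{\hbar^2}{2\sin\beta}\left\{\frac{\partial}{\partial\beta}\left(\frac{\sin\beta}{A}\frac{\partial\psi}{\partial\beta}\right) + \frac{\partial}{\partial\alpha}\left[\frac{\sin\beta}{A\sin^2\beta}\frac{\partial\psi}{\partial\alpha} - \frac{\sin\beta\cos\beta}{A\sin^2\beta}\frac{\partial\psi}{\partial\gamma}\right] + \frac{\partial}{\partial\gamma}\left[-\frac{\sin\beta\cos\beta}{A\sin^2\beta}\frac{\partial\psi}{\partial\alpha} + \left(\frac{\sin\beta}{C} + \frac{\sin\beta\cos^2\beta}{A\sin^2\beta}\right)\frac{\partial\psi}{\partial\gamma}\right]\right\}$$

$$= -\frac{\hbar^2}{2A}\left[\frac{\partial^2\psi}{\partial\beta^2} + \cot\beta\frac{\partial\psi}{\partial\beta} + \frac{1}{\sin^2\beta}\frac{\partial^2\psi}{\partial\alpha^2} + \left(\cot^2\beta + \frac{A}{C}\right)\frac{\partial^2\psi}{\partial\gamma^2} - \frac{2\cos\beta}{\sin^2\beta}\frac{\partial^2\psi}{\partial\alpha\,\partial\gamma}\right]$$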

Since the potential energy in this problem is zero, the Schrödinger
equation becomes

$$T\psi = E\psi$$

It is separable; for if we put

$$\psi = u(\alpha)\cdot v(\gamma)\cdot w(\beta)$$

the functions u and v are seen to satisfy equations of the form

$$a_2\frac{d^2u}{d\alpha^2} + a_1\frac{du}{d\alpha} + a_0u = 0, \qquad b_2\frac{d^2v}{d\gamma^2} + b_1\frac{dv}{d\gamma} + b_0v = 0$$

where the coefficients $a_0, a_1, a_2$ are not functions of α, and the coefficients
$b_0, b_1, b_2$ are not functions of γ. Such equations have solutions

$$u = e^{im\alpha}, \qquad v = e^{ik\gamma}$$

m and k being roots of algebraic quadratic equations involving the
coefficients a and b. However, these need not be solved here, since the
condition of single-valuedness dictates that m and k be integers. We therefore
put

$$u = e^{iM\alpha}, \qquad v = e^{iK\gamma}, \qquad M, K = 0, \pm1, \pm2, \text{etc.}$$
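The Schrödinger equation now reduces to the following ordinary
differential equation in the independent variable β:

$$w'' + \cot\beta\,w' - \left[\frac{M^2}{\sin^2\beta} + \left(\cot^2\beta + \frac{A}{C}\right)K^2 - \frac{2\cos\beta}{\sin^2\beta}KM - \frac{2A}{\hbar^2}E\right]w = 0$$

The substitutions

$$\tfrac{1}{2}(1 - \cos\beta) = x$$

$$w(\beta) = x^{|K-M|/2}(1 - x)^{|K+M|/2}F(x)$$

which are suggested when this equation is examined for its singularities
along the lines of Chapter 2, transform it to

$$(x^2 - x)\frac{d^2F}{dx^2} + [(1 + p)x - q]\frac{dF}{dx} - n(p + n)F = 0$$

the new parameters being defined as follows:

$$p = 1 + |K - M| + |K + M|$$

$$q = 1 + |K - M|$$

$$n(p + n) = \frac{2A}{\hbar^2}E - \left(\frac{A}{C} - 1\right)K^2 - \tfrac{1}{2}\left(|K - M| + |K + M|\right)\left[\tfrac{1}{2}\left(|K - M| + |K + M|\right) + 1\right]$$

This last relation, when rearranged, may be written

$$E = \frac{\hbar^2}{2A}\left[\left(n + \tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M|\right)\left(n + \tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M| + 1\right) + \left(\frac{A}{C} - 1\right)K^2\right] \tag{11-70}$$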


Reference to Chapter 2, eq. 56 will show at once that the differential
equation for F is none other than the familiar hypergeometric equation
defining the Jacobi polynomials, provided n is an integer. Unless this
condition is satisfied, F will diverge for x = 1, i.e., for β = π.

Eq. (70) takes a simpler form when we introduce the new quantum
number

$$J = n + \tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M|$$

which is evidently a positive integer or zero. We then obtain

$$E = \frac{\hbar^2}{2}\left[\frac{J(J + 1)}{A} + \left(\frac{1}{C} - \frac{1}{A}\right)K^2\right]$$

an equation which determines the energy levels of the symmetrical top.
Note that the quantity $\tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M|$ is equal to the larger
of the two integers |K| and |M|; in consequence of this neither |K| nor |M|
can be greater than J.

The energy levels of the spherical top (A = C) are those already
obtained in sec. 11.12 (cf. eq. 11-53).
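The level scheme of the symmetrical top can be tabulated with a few lines of code. The sketch assumes the energy formula in the form $E = \tfrac{1}{2}\hbar^2[J(J+1)/A + (1/C - 1/A)K^2]$ and also checks the remark that $\tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M|$ equals the larger of |K| and |M|; the function names and sample moments of inertia are illustrative.

```python
def top_energy(J, K, A, C, hbar=1.0):
    """Energy of the symmetrical top for quantum numbers J and K (|K| <= J)."""
    if abs(K) > J:
        raise ValueError("|K| cannot exceed J")
    return 0.5 * hbar ** 2 * (J * (J + 1) / A + (1.0 / C - 1.0 / A) * K * K)

def larger_magnitude(K, M):
    """(|K - M| + |K + M|) / 2, which equals max(|K|, |M|) for integers."""
    return (abs(K - M) + abs(K + M)) // 2

def levels(J_max, A, C):
    """List of (J, |K|, E) for all allowed values up to J_max."""
    return [(J, K, top_energy(J, K, A, C))
            for J in range(J_max + 1) for K in range(J + 1)]

if __name__ == "__main__":
    for J, K, energy in levels(2, A=2.0, C=1.0):
        print(J, K, energy)
```

With A = C the K-dependence disappears and the levels reduce to $\hbar^2J(J+1)/2A$, the rigid-rotator result with moment of inertia A.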


MATRIX MECHANICS 


11.15. General Remarks and Procedure.—The formulation of quantum 
mechanics we have given in the foregoing sections was historically preceded 
by Heisenberg’s matrix theory. The latter, while it appears at first 
glance to be an altogether different mathematical structure, strikingly 
produced the same results as the former. But when the initial amazement 
subsided both formulations were recognized as equivalent. In the present 
text the Schrödinger-Dirac theory was discussed first because its axioms
seem perhaps less strange, and because its point of view has been more 
widely adopted. The terminology of matrix mechanics, however, enjoys 
great popularity and is often conducive to clarity of expression. 

It is possible, and perhaps pedagogically worth while, to derive
Heisenberg's theory from the postulates of the earlier part of this chapter. But when this
is done, the impressive element of uniqueness which attaches to matrix
mechanics is completely lost. To preserve it we proceed to state the basic
facts of the theory first, to give an example of its application, and then to
exhibit its relation to the preceding developments. We can afford to be
brief, for when the equivalence of the two theories is once established, no
new insight is likely to be gained by deducing former results over again in a
different manner. As before, attention will be limited to what we have
called quantum statics. The principal facts of Chapter 10 will be used.

Heisenberg associates with every observable a square Hermitian matrix.
As in the Schrödinger theory, one of the chief concerns of matrix mechanics
is the determination of the measurable values of an observable. Let it be
desired to find the observable values of a quantity H, which, classically,
is a function of the Cartesian coordinates $q_i$ and momenta $p_i$,
$H = H(q_1 \cdots q_n;\ p_1 \cdots p_n)$. In our example we shall specify H to be
the energy, but this restriction is not necessary. Heisenberg's directions
are these:

Find a set of matrices $Q_1, Q_2, \cdots, Q_n$; $P_1, P_2, \cdots, P_n$ which (a) satisfy
the commutation rules

$$Q_mQ_n - Q_nQ_m = 0; \qquad P_mP_n - P_nP_m = 0; \qquad P_mQ_n - Q_nP_m = -i\hbar\,\delta_{mn}I \tag{11-71}$$

where I is the unit matrix; (b) render the matrix

$$H(Q_1 \cdots Q_n;\ P_1 \cdots P_n)\ \text{diagonal} \tag{11-72}$$

By $H(Q_1 \cdots Q_n;\ P_1 \cdots P_n)$ is meant, of course, the matrix which is
the same function of the matrices $Q_1 \cdots P_n$ that the ordinary function H
is of $q_1 \cdots p_n$. The existence of the matrix H and its uniqueness will be
assumed. When such a set of matrices has been found, the diagonal
elements of H will be the measurable values in question. (It is also true that
the squares of the absolute values of the elements $(Q_i)_{\lambda\mu}$ are simply related
to spectroscopic transition probabilities, as will be shown later; but this
does not concern us here.) We illustrate the power of the method by an
example.

11.16. Simple Harmonic Oscillator.—The Hamiltonian function is
(cf. sec. 11)

$$H = \frac{p^2}{2m} + \tfrac{1}{2}m\omega^2q^2$$

Hence, if P and Q are matrices,

$$H(Q,P) = \frac{1}{2m}(P^2 + m^2\omega^2Q^2)$$


The straightforward way of working this problem would be to select a
set of matrices such as, e.g.,

$$Q_{\lambda\mu} = \delta_{\lambda,\mu+1}, \qquad P_{\lambda\mu} = -i\hbar\mu\,\delta_{\lambda,\mu-1} \tag{11-73}$$

which satisfy the commutation rule (71):

$$(PQ)_{\lambda\mu} - (QP)_{\lambda\mu} = -i\hbar\,\delta_{\lambda\mu} \tag{11-71a}$$

as the reader may verify. These must then be subjected to a similarity
transformation with some other matrix, say S, until the new matrices

$$Q' = S^{-1}QS, \qquad P' = S^{-1}PS$$

when substituted in H, make H a diagonal matrix. (Cf. Chap. 10.) This
procedure, however, is usually very cumbersome and is rarely used. The
success of the matrix method depends frequently on fortunate guesses or
on specific properties of the Hamiltonian. In the present instance the
following considerations lead most directly to a solution of the problem.
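The commutation rule for a trial pair of matrices can be checked mechanically on a finite section. The sketch below assumes the pair $Q_{\lambda\mu} = \delta_{\lambda,\mu+1}$, $P_{\lambda\mu} = -i\hbar\mu\,\delta_{\lambda,\mu-1}$ with indices running over all integers, truncates the index range to −cutoff … +cutoff, and forms PQ − QP. Because the true matrices are infinite, the first and last diagonal elements of the truncated commutator are spoiled by the cut and are excluded from the comparison.

```python
def trial_matrices(cutoff, hbar=1.0):
    """Finite sections of Q and P for indices -cutoff..cutoff (an assumed truncation)."""
    idx = list(range(-cutoff, cutoff + 1))
    Q = [[1.0 if lam == mu + 1 else 0.0 for mu in idx] for lam in idx]
    P = [[-1j * hbar * mu if lam == mu - 1 else 0.0 for mu in idx] for lam in idx]
    return idx, Q, P

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def commutator(p, q):
    """Matrix PQ - QP."""
    pq, qp = matmul(p, q), matmul(q, p)
    size = len(p)
    return [[pq[i][j] - qp[i][j] for j in range(size)] for i in range(size)]

if __name__ == "__main__":
    idx, Q, P = trial_matrices(3)
    C = commutator(P, Q)
    print([C[i][i] for i in range(1, len(idx) - 1)])   # interior diagonal: -i*hbar
```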
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Suppose that the matrices P and Q, which satisfy (71a) and make H 
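diagonal, have already been found. Then

$$H_{ij} = \frac{1}{2m}(P^2 + m^2\omega^2Q^2)_{ij} = E_i\,\delta_{ij} \tag{11-74}$$

provided we write $E_i$ for the diagonal elements of H.
Now let

$$A = P - im\omega Q$$

and

$$B = P + im\omega Q.$$

Then, because of eq. (71a),

$$AB = 2mH + m\omega\hbar I \tag{11-75}$$

and

$$BA = 2mH - m\omega\hbar I \tag{11-76}$$

Now form ABA from (75) and (76):

$$A(2mH - m\omega\hbar I) = (2mH + m\omega\hbar I)A$$

$$\sum_\lambda A_{k\lambda}(E_\lambda\delta_{\lambda j} - \tfrac{1}{2}\hbar\omega\,\delta_{\lambda j}) = \sum_\lambda(E_k\delta_{k\lambda} + \tfrac{1}{2}\hbar\omega\,\delta_{k\lambda})A_{\lambda j}$$

$$A_{kj}(E_j - E_k - \hbar\omega) = 0.$$

Hence $A_{kj}$ vanishes unless

$$E_j - E_k = \hbar\omega$$

Next, form BAB from (75) and (76):

$$B(2mH + m\omega\hbar I) = (2mH - m\omega\hbar I)B$$

$$\sum_\lambda B_{k\lambda}(E_\lambda\delta_{\lambda j} + \tfrac{1}{2}\hbar\omega\,\delta_{\lambda j}) = \sum_\lambda(E_k\delta_{k\lambda} - \tfrac{1}{2}\hbar\omega\,\delta_{k\lambda})B_{\lambda j}$$

$$B_{kj}(E_j - E_k + \hbar\omega) = 0$$

or, by changing the subscripts,

$$B_{jk}(E_k - E_j + \hbar\omega) = 0$$

Hence $B_{jk}$ vanishes unless

$$E_j - E_k = \hbar\omega$$

Now take a diagonal element of eq. (76):

$$(BA)_{jj} = 2m(E_j - \tfrac{1}{2}\hbar\omega) \tag{11-77}$$

But

$$(BA)_{jj} = \sum_k B_{jk}A_{kj}.$$

Each term in the summation over k vanishes except the one for which
$E_k = E_j - \hbar\omega$. Suppose $E_j$ is given. Then either

$$E_k = E_j - \hbar\omega$$

is another eigenvalue, in which case the right side of eq. (77) is finite.
Or there is no eigenvalue which is less than $E_j$ by ħω. Then the right side
of (77) is zero and

$$E_j = \tfrac{1}{2}\hbar\omega$$

This must be the lowest eigenvalue. From this analysis we may conclude
that the sequence of eigenvalues is

$$\tfrac{1}{2}\hbar\omega,\ \tfrac{3}{2}\hbar\omega,\ \tfrac{5}{2}\hbar\omega,\ \text{etc.}$$

in agreement with the results of sec. 11.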

11.17. Equivalence of Operator and Matrix Methods.—We first establish
a theorem of great importance in quantum mechanics. Consider a
differential, Hermitian operator L of the kind discussed in sec. 4, which
generates, through the eigenvalue equation

$$L\phi_i = l_i\phi_i$$

a complete set of orthonormal functions $\phi_i$. Whether $\phi_i$ is a function of
one or many coordinates is unimportant in this connection. If we introduce
other operators M, N which act on the same variables as L we can
clearly form two square arrays of numbers, i.e., matrices, by the rule:

$$M_{ij} = \int\phi_i^*M\phi_j\,d\tau, \qquad N_{ij} = \int\phi_i^*N\phi_j\,d\tau \tag{11-78}$$

dτ being the element of configuration space of the variables of φ. The
theorem asserts that equations which hold between the operators M and N,
also hold between the matrices formed by the rule (78). To prove this it is
necessary only to establish this parallelism for the two fundamental
operations, addition and multiplication:

$$(M + N)_{ij} = M_{ij} + N_{ij} \tag{11-79}$$

$$(MN)_{ij} = \sum_\lambda M_{i\lambda}N_{\lambda j} \tag{11-80}$$

The first of these is at once evident from (78). To prove the second,
let us expand the function $N\phi_j$ in terms of the $\phi_i$ themselves:

$$N\phi_j = \sum_\lambda a_\lambda\phi_\lambda \tag{11-81}$$

By the general procedure of finding the expansion coefficients,¹⁶

$$a_\lambda = \int\phi_\lambda^*N\phi_j\,d\tau = N_{\lambda j} \tag{11-82}$$

The left side of (80) is, by definition, $\int\phi_i^*MN\phi_j\,d\tau$. On using (81)
and (82), this becomes

$$(MN)_{ij} = \int\phi_i^*M\sum_\lambda a_\lambda\phi_\lambda\,d\tau = \sum_\lambda\int\phi_i^*M\phi_\lambda\,d\tau\;a_\lambda = \sum_\lambda M_{i\lambda}N_{\lambda j},$$

in accord with eq. (80).

If, then, we wish to form matrices satisfying relations like (71) or (71a),
we need only find operators which conform to them, select an orthonormal
set of functions $\phi_i$ and construct the matrices by means of the rule
(78).


16 Multiply the equation by $\phi_\lambda^*$ and integrate, using the orthogonality of the $\phi_i$.


Problem a. The operators $Q = e^{ix}$, and $P = -\hbar e^{-ix}(d/dx)$ satisfy

$$PQ - QP = -i\hbar$$

Use the functions $\phi_k = (2\pi)^{-1/2}e^{ikx}$, k = 0, ±1, ±2, ⋯ to construct the matrices $P_{\lambda\mu}$,
$Q_{\lambda\mu}$. They will be found to be identical with those given in eq. (73).

Problem b. Construct the matrices $X_{nm}$ and $P_{nm}$, using X = x, P = −iħ(d/dx),
and taking as the orthonormal set the normalized Hermite orthogonal functions
discussed in Chapter 3. Note that n and m can only be 0 or positive.

Ans. $X_{nm} = \sqrt{(n + 1)/2\beta}\;\delta_{m,n+1} + \sqrt{n/2\beta}\;\delta_{m,n-1}$;
$P_{nm} = i\hbar\beta(n - m)X_{nm}$. [β is defined after eq. (39).]
Show that these matrices satisfy (71a).
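The matrices of Problem b lend themselves to a numerical check. The sketch below assumes the answer $X_{nm} = \sqrt{(n+1)/2\beta}\,\delta_{m,n+1} + \sqrt{n/2\beta}\,\delta_{m,n-1}$, $P_{nm} = i\hbar\beta(n - m)X_{nm}$, builds finite sections, and forms both PX − XP and $H = (P^2 + m^2\omega^2X^2)/2m$ with $\omega = \beta\hbar/m$. Since the true matrices are infinite, the last diagonal element of each product is affected by the truncation and is left out of the comparison; all names and default parameter values are illustrative.

```python
import math

def oscillator_matrices(size, beta=1.0, hbar=1.0):
    """Finite sections of X and P in the basis of Hermite orthogonal functions."""
    X = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        X[n][n + 1] = X[n + 1][n] = math.sqrt((n + 1) / (2.0 * beta))
    P = [[1j * hbar * beta * (n - m) * X[n][m] for m in range(size)]
         for n in range(size)]
    return X, P

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def commutator(p, x):
    """Matrix PX - XP."""
    px, xp = matmul(p, x), matmul(x, p)
    size = len(p)
    return [[px[i][j] - xp[i][j] for j in range(size)] for i in range(size)]

def hamiltonian(x, p, beta=1.0, hbar=1.0, mass=1.0):
    """(P^2 + m^2 omega^2 X^2) / 2m with omega fixed by beta = m omega / hbar."""
    omega = beta * hbar / mass
    p2, x2 = matmul(p, p), matmul(x, x)
    size = len(x)
    return [[(p2[i][j] + (mass * omega) ** 2 * x2[i][j]) / (2.0 * mass)
             for j in range(size)] for i in range(size)]

if __name__ == "__main__":
    X, P = oscillator_matrices(6)
    H = hamiltonian(X, P)
    print([H[n][n].real for n in range(5)])   # (n + 1/2) in units of hbar*omega
```

The diagonal of H reproduces the sequence of eigenvalues found in sec. 11.16, and the off-diagonal elements vanish.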
It is interesting to note here that a Hermitian operator, defined by
eq. (15), generates a Hermitian matrix (cf. sec. 11.10). For

$$\int\phi_i^*P\phi_j\,d\tau = \int\phi_jP^*\phi_i^*\,d\tau$$

simply means

$$P_{ij} = P_{ji}^*$$

in our present notation.

The success of Heisenberg's directions is now easily understood. The
differential operators which obey relations analogous to those prescribed
for Heisenberg's matrices (71) are

$$Q_m = q_m; \qquad P_m = -i\hbar\frac{\partial}{\partial q_m}$$

in other words precisely the former, Schrödinger operators.¹⁷ Suppose we
select an orthonormal set of functions, $\phi_i$, belonging to the operator L, and
construct

$$(Q_m)_{ij} = \int\phi_i^*q_m\phi_j\,d\tau, \text{ etc.}$$

When these matrices are substituted into the functional form H the result
is the same as if we had at once formed

$$H_{ij} = \int\phi_i^*H\phi_j\,d\tau$$

as follows from the theorem we have proved. But the only condition
under which this matrix can be diagonal is

$$H\phi_j = \text{const.}\ \phi_j \tag{11-83}$$

that is to say, the φ-functions must be chosen to be eigenfunctions of the
Hamiltonian H. The problem of making the matrix H diagonal is equivalent
to selecting the proper $\phi_i$, i.e., to solving the Schrödinger equation.
To see that the diagonal elements of H are the permissible energies $E_i$ of
the former theory, we need only substitute $H\phi_j = E_j\phi_j$ into the integral
for $H_{ij}$, obtaining

$$H_{ij} = E_j\,\delta_{ij}$$


17 The fact that there are also others, like the ones considered in problem a, need
not disturb us here. The Schrödinger equation which results when they are used
appears different, to be sure, but reduces to its familiar form when a change of variable
is made.

It is easy to extend the Heisenberg theory beyond the limits of the
present development. The second postulate, eq. (8), is valid if P is
interpreted as a matrix and ψ as a vector. In the terminology of Chapter
10, the $\psi_\lambda$ are then the eigenvectors of the matrix P, and the $p_\lambda$ are its
eigenvalues. The relation of the eigenvectors to the state functions is not
difficult to see. Suppose we choose a basic orthonormal set of functions,
$\phi_i$. Expand the eigenfunction $\psi_i$ appearing in the operator equation

$$P\psi_i = p_i\psi_i \tag{11-84}$$

in terms of them, viz., $\psi_i = \sum_\lambda a_{i\lambda}\phi_\lambda$.
Now multiply (84) by $\phi_j^*$ and integrate. We find immediately

$$\sum_\lambda P_{j\lambda}a_{i\lambda} = p_ia_{ij}$$

and conclude that the eigenvector $\psi_i$ has as components the coefficients
which appear in its expansion in terms of the basic φ. More explicitly,

$$\psi_i = \begin{pmatrix} a_{i1} \\ a_{i2} \\ a_{i3} \\ \vdots \end{pmatrix}$$

The last equation then reads $(P\psi_i)_j = p_i(\psi_i)_j$. If the basic set is identical
with the eigenfunctions of the operator P, the eigenvector has only one
non-vanishing component.

Finally, even the third postulate, (14), may be retained in the Heisenberg
theory if its form is suitably changed. We interpret φ as a vector φ
with components $a_i$, the $a_i$ being the coefficients in the expansion of the
function $\phi = \sum_\lambda a_\lambda\phi_\lambda$ in terms of our basic $\phi_i$ (φ without subscript here
denotes an arbitrary state function, not necessarily one of the set $\phi_i$), and
we interpret φ* not as the complex conjugate but as the associate vector:

$$\phi^\dagger = (a_1^*\ a_2^*\ a_3^*\ \cdots)$$

P represents the matrix $P_{ij} = \int\phi_i^*P\phi_j\,d\tau$. Eq. (14) must then be
modified to

$$\bar{p} = \phi^\dagger P\phi$$

which reads, when written more explicitly,

$$\bar{p} = \sum_{\lambda\mu}a_\lambda^*P_{\lambda\mu}a_\mu$$

When the $\phi_i$ are taken to be the eigenstates of the operator P, the
matrix P becomes diagonal, and $\bar{p} = \sum_\lambda a_\lambda^*a_\lambda p_\lambda$, which is the same relation
as was found in the Schrödinger theory under these conditions.


Problem. Calculate the integral

$$\int_{-\infty}^{\infty}x^k e^{-x^2}H_n(x)H_m(x)\,dx$$

by the methods of matrix mechanics. Let $\phi_n = c_ne^{-x^2/2}H_n(x)$, $\phi_m = c_me^{-x^2/2}H_m(x)$,
where $c_n$, $c_m$ are normalizing factors, and note that, aside from normalizing factors, the
integral is the matrix element $(x^k)_{nm}$. Now $x_{\lambda\mu}$ is given by eq. (3-93); this may be used
in calculating

$$(x^k)_{nm} = \sum_{\lambda\mu\nu\cdots}x_{n\lambda}\,x_{\lambda\mu}\,x_{\mu\nu}\cdots x_{\rho m}$$


APPROXIMATION METHODS FOR SOLVING EIGENVALUE PROBLEMS 


11.18. Variational (Ritz) Method.—In Chapter 8 we showed that the
differential equation $L(u) + \lambda wu = (pu')' - qu + \lambda wu = 0$ is the necessary
(though not sufficient!) condition upon u if it is to minimize the integral
$A(u) = \int(pu'^2 + qu^2)\,dx$. Furthermore, it was seen that A(u) could
be transformed (cf. eq. 8-37) by simple steps to $-\int uL(u)\,dx$. The
theory in this simple form is applicable to every one-dimensional
Schrödinger equation, for in that case the Hamiltonian operator
$H = -(\hbar^2/2m)(d^2/dx^2) + V(x)$ is of the form −L if only we identify p
with $\hbar^2/2m$ and q with V. Hence we may at once say that the Schrödinger
equation is the necessary condition upon ψ so that the integral

$$\int\psi H\psi\,d\tau$$

shall be a minimum. The one-dimensional variation theory may also be
applied, though in a somewhat more cumbersome manner, to every ordinary
differential equation to which the multi-dimensional Schrödinger
equation gives rise on separation of variables. It is possible, however, to
prove a far more general theorem which is of utmost utility in numerous
problems of applied mathematics, a theorem of which the former statement
is a special case.

Let P be a Hermitian operator. We wish to find the normalized function
$\psi$ which will make the integral

$$\int \psi^* P\psi\,d\tau$$

a minimum. The integration extends, as usual, over configuration space,
and we shall assume for the sake of definiteness that $\tau$ is a finite portion of
configuration space. Certainly, the necessary condition upon $\psi$ is that

$$\delta\left[\int \psi^* P\psi\,d\tau - \lambda\int \psi^*\psi\,d\tau\right]$$

shall vanish; $\lambda$ is an undetermined (Lagrangian) multiplier (cf. sec. 6.5).
Now the variation symbol and the integral sign are commutable in this
expression because the limits of the integration are supposed finite and
fixed. Hence we have

$$\int \delta\psi^*\cdot P\psi\,d\tau + \int \psi^*\cdot\delta(P\psi)\,d\tau - \lambda\int \delta\psi^*\cdot\psi\,d\tau - \lambda\int \psi^*\,\delta\psi\,d\tau = 0$$ (11-85)

The second integral in this expression may be transformed in two steps.
First, $\delta(P\psi)$ may be replaced by $P(\delta\psi)$ since the operator P suffers no variation.
Second, because P is Hermitian and both $\psi$ and $\delta\psi$ are acceptable
functions, $\int \psi^* P(\delta\psi)\,d\tau = \int \delta\psi\cdot P^*\psi^*\,d\tau$. Eq. (85) therefore reads

$$\int \delta\psi^*(P\psi - \lambda\psi)\,d\tau + \int \delta\psi\,(P^*\psi^* - \lambda\psi^*)\,d\tau = 0$$ (11-86)


Here $\delta\psi$ is an entirely arbitrary function. Let us take it to be real, so that
$\delta\psi^* = \delta\psi$. Eq. (86) can then be satisfied only if

$$P\psi - \lambda\psi + P^*\psi^* - \lambda\psi^* = 0$$

On the other hand, if we take $\delta\psi$ to be imaginary, so that $\delta\psi^* = -\delta\psi$,
we conclude

$$P\psi - \lambda\psi - P^*\psi^* + \lambda\psi^* = 0$$

Addition of the last two equations yields

$$P\psi = \lambda\psi$$

subtraction gives

$$P^*\psi^* = \lambda\psi^*$$

We have shown that, if

$$\delta\int \psi^* P\psi\,d\tau = 0$$ (11-87)

for normalized $\psi$, this function must satisfy the eigenvalue equation

$$P\psi = \lambda\psi$$ (11-88)

which also automatically determines $\lambda$. Whether, when (88) is satisfied,
the minimum of, or indeed the integral, $\int \psi^* P\psi\,d\tau$, actually exists, is a
point we have not investigated. It is customary in physics not to worry
about these eventualities, for they are difficult to discuss. The mathematical
equivalence of the minimal property of the integral and eq. (88) is
usually taken as a matter of faith.
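The connection between a stationary value of $\int\psi^*P\psi\,d\tau$ and the eigenvalue equation can be made concrete in the smallest possible case. The sketch below is an illustration only: P is replaced by an arbitrarily chosen real symmetric 2 x 2 matrix (an assumption, not an operator from the text), and the normalized "function" is the vector $(\cos\theta, \sin\theta)$, so that normalization is built in and only one parameter remains to be varied.

```python
import math

# Arbitrary real symmetric (hence Hermitian) matrix, chosen only for illustration.
P = [[2.0, 1.0],
     [1.0, 3.0]]

def mean_value(theta):
    """psi^T P psi for the normalized vector psi = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return P[0][0] * c * c + 2.0 * P[0][1] * c * s + P[1][1] * s * s

def stationary_angle():
    """Angle where d(mean_value)/d(theta) = 0, i.e. tan(2 theta) = 2 P01 / (P00 - P11)."""
    return 0.5 * math.atan2(2.0 * P[0][1], P[0][0] - P[1][1])

def residual(theta):
    """Largest component of P psi - lambda psi, with lambda = mean_value(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    lam = mean_value(theta)
    r0 = P[0][0] * c + P[0][1] * s - lam * c
    r1 = P[1][0] * c + P[1][1] * s - lam * s
    return max(abs(r0), abs(r1))
```

Both stationary angles, `stationary_angle()` and that angle plus $\pi/2$, give a vanishing residual: the vector satisfies $P\psi = \lambda\psi$ there, and the multiplier equals the mean value itself. One of the two is the maximum and the other the minimum of the mean value.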


If $\psi$ satisfies eq. (88), then $\int \psi^* P\psi\,d\tau = \lambda$. From what has been said
it follows, therefore, that the integral $\int \phi^* P\phi\,d\tau$ computed with a function $\phi$
different from the minimizing $\psi$, cannot be smaller than $\lambda$. But here a
slight complication arises, for there are many eigenvalues $\lambda_i$. All that we
can really say is that for a function $\phi$ in the "neighborhood" of $\psi_i$, the
integral will not be greater than $\lambda_i$. Certainly, however,

$$\int \phi^* P\phi\,d\tau \ge \lambda_0$$ (11-89)

if $\phi$ is any analytic and continuous function¹⁸ and $\lambda_0$ the lowest eigenvalue.
The Ritz method,¹⁹ named after its inventor, is a systematic procedure,
based upon the foregoing variational considerations, for solving the eigenvalue
equation (88) by substituting into the integral in (87) a suitable
sequence of functions which causes the integral to converge upon the
value $\lambda$. Instead of presenting the method in its original form, we shall
here work out some of its features in a manner more directly adapted to
the needs of quantum mechanics, and with a slight loss of rigor. We are
usually interested in finding the energies, particularly the lowest (normal
state) energy of physical or chemical systems, hence we identify at once
the operator P in (89) with the Hamiltonian H.

18 Restriction to functions with a certain number of derivatives is necessary because
P is in general a differential operator, and $P\phi$ must have meaning.
19 Ritz, W., J. f. reine und angew. Math. 135, 1 (1909); Courant-Hilbert, p. 150.

The simplest way of finding an approximation to the lowest energy of a
system is to use (89) directly. Sometimes a good guess can be made as to
the general form of the true state function $\psi$, a form which may allow the
inclusion of one or more arbitrary parameters. The integral in (89) is
then computed with this function, and the result is minimized with respect
to the parameters. An example will clarify the method.
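Before the helium example, the procedure can be tried on a case whose exact answer is known. The sketch below is not taken from the text: it assumes a Gaussian trial function $\phi \propto e^{-\alpha r^2}$ for the hydrogen atom, for which the integral in (89) has the closed form $E(\alpha) = \tfrac{3}{2}\alpha - 2\sqrt{2\alpha/\pi}$ in units where $\hbar = m = e = 1$ (so that the exact lowest energy is $-\tfrac12$). The golden-section search is just one convenient way to minimize with respect to the single parameter.

```python
import math

def energy(alpha):
    """Variational integral for the assumed trial function exp(-alpha r^2), atomic units."""
    kinetic = 1.5 * alpha
    potential = -2.0 * math.sqrt(2.0 * alpha / math.pi)
    return kinetic + potential

def minimize(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) < f(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 0.5 * (a + b)

alpha_best = minimize(energy, 0.01, 2.0)
e_best = energy(alpha_best)
```

Setting $dE/d\alpha = 0$ by hand gives $\alpha = 8/9\pi$ and $E = -4/3\pi$, which lies above the exact value $-\tfrac12$ as (89) requires; the gap measures the defect of the Gaussian form.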

11.19. Example: Normal State of the Helium Atom.—The helium
atom consists of two electrons moving in the field of a nucleus of charge 2e
and at the same time repelling each other. We consider the nucleus as
stationary and denote the distances of the two electrons from it by $r_1$
and $r_2$ respectively; $r_{12}$ is the interelectronic distance. The potential
energy is $-2e^2(1/r_1 + 1/r_2) + e^2/r_{12}$, and the Schrödinger equation

$$H\psi \equiv \left[-\frac{\hbar^2}{2m}(\nabla_1^2 + \nabla_2^2) - 2e^2\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{r_{12}}\right]\psi = E\psi$$ (11-90)

A subscript on the symbol $\nabla$ indicates that the Laplacian is to be taken
with respect to the coordinates labeled by the subscript. If the term
$e^2/r_{12}$ were absent eq. (90) would be separable, for then the operator H
would be the sum of two helium-ion Hamiltonians, $H = H_1 + H_2$, the
first acting on the coordinates of electron 1, the second on those of electron 2.
But the equation

$$(H_1 + H_2)\psi = E\psi$$

may be separated on substitution of $\psi = u(1)v(2)$, where u(1) stands for a
function of the space coordinates of electron 1, and v(2) is defined similarly.
For it becomes, after division by $\psi$,

$$\frac{H_1u(1)}{u(1)} + \frac{H_2v(2)}{v(2)} = E, \text{ a constant}$$

which indicates that $H_1u(1) = E_1u(1)$; $H_2v(2) = E_2v(2)$; $E_1 + E_2 = E$.
But the first two of these are simply Schrödinger equations for the singly
charged helium ion, whose solutions we already know. (Cf. eq. 67a.) Since
we wish to find the lowest energy of our system, we identify the functions as
follows:

$$u(1) = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2} e^{-Zr_1/a_0}, \qquad v(2) = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2} e^{-Zr_2/a_0}$$

and $\psi$ is the product of these.




The correct solution of eq. (90) is certainly not of this exact form
because of the "interaction term" $e^2/r_{12}$, whose effect on $\psi$ one would
expect to be very complicated indeed. Aside from other changes, it will
cause $\psi$ to depend on $r_{12}$ explicitly. But from a physical point of view, the
repulsion between the electrons will cause both of them to be, on the average,
farther away from the nucleus than if the repulsion were absent. This
would mean that the functions u and v are in error with respect to the scale
factor $Z/a_0$. If this were smaller, a more extended probability distribution
would result. (For the helium ion Z = 2.) It would seem expedient,
therefore, that we take as our "trial" function in the variational procedure
the function $\phi = u(1)v(2)$ but with an undetermined Z.

In calculating

$$\int \phi H\phi\,d\tau$$ (11-91)

it is well to have available the differential equation whose solutions are u
and v:

$$-\frac{\hbar^2}{2m}\nabla_1^2 u(1) = Z^2E_H\,u(1) + \frac{Ze^2}{r_1}u(1), \qquad v(2) = u(2)$$ (11-92)

Here $E_H$ is the energy of the normal hydrogen atom, $E_H = -e^2/2a_0$
(= −13.53 electron volts). The differential $d\tau$ in (91) represents, of course, the
product of the volume elements for the two electrons. When H is taken
from (90), we find, using (92) and the fact that u is normalized,


$$\int \phi H\phi\,d\tau = 2Z^2E_H + (Z-2)e^2\int\left(\frac{1}{r_1} + \frac{1}{r_2}\right)\phi^2\,d\tau + e^2\int\frac{\phi^2}{r_{12}}\,d\tau$$ (11-93)

The integral

$$\int\frac{\phi^2}{r_1}\,d\tau = \int\frac{u^2(1)}{r_1}\,d\tau_1\cdot\int v^2(2)\,d\tau_2 = \int\frac{u^2(1)}{r_1}\,r_1^2\,dr_1\sin\theta_1\,d\theta_1\,d\varphi_1$$

is easily computed directly. It has, in fact, already been evaluated (cf.
sec. 3.11, example) and found to be $Z/a_0$. The other integral, $\int(\phi^2/r_2)\,d\tau$,
has the same value. We leave the evaluation of $e^2\int(\phi^2/r_{12})\,d\tau$ for later; its
value is $-\tfrac{5}{4}ZE_H$. Hence, since $e^2/a_0 = -2E_H$, eq. (93) becomes

$$\int \phi H\phi\,d\tau = 2Z^2E_H + (Z-2)\cdot\frac{2Ze^2}{a_0} - \tfrac{5}{4}ZE_H = Z\left[2Z - 4(Z-2) - \tfrac{5}{4}\right]E_H$$

This expression is to be made as small as possible by choosing Z properly,
i.e., the coefficient must take its maximum value because $E_H < 0$. Putting
the derivative with respect to Z equal to zero, we find for the minimizing Z
the value 27/16, which is somewhat less than 2 as we expected. Hence
the best energy value attainable by adjusting Z in our function is
$Z(27/4 - 2Z)E_H = 5.695E_H$. The energy found experimentally is
$5.807E_H$. The difference between these two values is to be ascribed to the
defects of the simple trial function here chosen.

Fig. 11-4

A very interesting summary of the results of the present method as
applied to helium is given by Pauling and Wilson.²⁰ Their table shows how
the value of the integral approaches the experimental energy as increasingly
refined trial functions are used.
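The minimization over Z is elementary, but it is easily checked with exact rational arithmetic. The sketch below simply evaluates the coefficient $Z(27/4 - 2Z)$ of $E_H$ obtained above; the function names are hypothetical.

```python
from fractions import Fraction

def energy_coefficient(z):
    """Coefficient of E_H in the variational integral, Z(27/4 - 2Z)."""
    return z * (Fraction(27, 4) - 2 * z)

def best_scale_factor():
    """Z at which the derivative 27/4 - 4Z of the coefficient vanishes."""
    return Fraction(27, 4) / 4

z_best = best_scale_factor()                  # 27/16
best_coefficient = energy_coefficient(z_best) # 729/128, about 5.695
unscreened = energy_coefficient(Fraction(2))  # 11/2, the value for Z = 2
```

Because $E_H$ is negative, the larger coefficient 729/128 corresponds to the lower energy; with the unadjusted helium-ion value Z = 2 the coefficient is only 11/2.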

To complete the analysis we indicate how the integral

$$I = e^2\int\frac{\phi^2}{r_{12}}\,d\tau$$

may be computed. The method is typical of the evaluation of "double
volume" integrals involving the variable $r_{12}$, and hence perhaps of some
interest. The volume element

$$d\tau = r_1^2\,dr_1\sin\theta_1\,d\theta_1\,d\varphi_1\cdot r_2^2\,dr_2\sin\theta_2\,d\theta_2\,d\varphi_2$$

may also be expressed as follows (see Fig. 4):

$$d\tau = r_1^2\,dr_1\sin\theta_1\,d\theta_1\,d\varphi_1\cdot r_{12}^2\,dr_{12}\sin\psi\,d\psi\,d\chi$$

Now $r_2^2 = r_1^2 + r_{12}^2 - 2r_1r_{12}\cos\psi$, whence $r_2\,dr_2 = r_1r_{12}\sin\psi\,d\psi$ provided
$r_1$ and $r_{12}$ are held fixed. By means of this relation $\sin\psi\,d\psi$ may be eliminated
from the last expression for $d\tau$, and we obtain

$$d\tau = r_1\,dr_1\,r_2\,dr_2\,r_{12}\,dr_{12}\sin\theta_1\,d\theta_1\,d\varphi_1\,d\chi$$

Substitute this volume element into I, and integrate at once over the
angles, thus introducing the factors $2\cdot 2\pi\cdot 2\pi$. On using the abbreviation
$\alpha = 2Z/a_0$, we obtain

$$I = \frac{\alpha^6e^2}{8}\iiint e^{-\alpha(r_1+r_2)}\,r_1\,dr_1\,r_2\,dr_2\,dr_{12}$$

The ranges of integration are: $0 \le r_1 \le \infty$, $0 \le r_2 \le \infty$; $|r_1 - r_2| \le r_{12} \le r_1 + r_2$.
The absolute value sign on the limit for $r_{12}$ forces us to split
the integration over $r_2$ into two parts, (a) $r_2 > r_1$, (b) $r_2 < r_1$. In range (a)
the lower limit of $r_{12}$ is $r_2 - r_1$, in case (b) it is $r_1 - r_2$. Thus

$$I = \frac{\alpha^6e^2}{8}\left[\int_0^\infty e^{-\alpha r_1}r_1\,dr_1\int_{r_1}^\infty e^{-\alpha r_2}r_2\,dr_2\int_{r_2-r_1}^{r_2+r_1}dr_{12} + \int_0^\infty e^{-\alpha r_2}r_2\,dr_2\int_{r_2}^\infty e^{-\alpha r_1}r_1\,dr_1\int_{r_1-r_2}^{r_1+r_2}dr_{12}\right]$$

Inspection shows that the two triple integrals are equal. The calculation
is now perfectly straightforward; it makes use of the formula

$$\int_0^\infty e^{-\alpha r}r^n\,dr = \alpha^{-(n+1)}\,n!$$

and leads to the result

$$I = \frac{5}{8}\frac{Ze^2}{a_0} = -\frac{5}{4}ZE_H$$

which was used above.

20 Pauling and Wilson, p. 224.
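The value of the interaction integral can be confirmed by crude numerical quadrature. The sketch below is an illustration only: it works in the dimensionless variables $x = \alpha r$, carries out the trivial $r_{12}$ integration analytically (it contributes $2r_1$ in range (a)), and integrates the remaining double integral by Simpson's rule with a finite cutoff. The cutoff and the number of intervals are arbitrary choices, and the result is expressed in units of $e^2/a_0$.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def repulsion_integral(z, cutoff=40.0):
    """I = e^2 * integral of phi^2 / r12, in units of e^2/a0, for scale factor Z."""
    def inner(x1):
        # range (a): x2 > x1
        return simpson(lambda x2: math.exp(-x2) * x2, x1, cutoff)
    # the r12 integration supplied the factor 2*x1 in range (a)
    one_region = simpson(lambda x1: math.exp(-x1) * 2.0 * x1**2 * inner(x1), 0.0, cutoff)
    alpha = 2.0 * z
    # prefactor alpha^6/8 times alpha^-5 from the change of variables;
    # ranges (a) and (b) contribute equally
    return alpha / 8.0 * 2.0 * one_region
```

Within the accuracy of the quadrature, `repulsion_integral(z)` agrees with $5Z/8$, e.g. 1.25 for Z = 2.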

11.20. The Method of Linear Variation Functions.—It is often convenient
to use as the trial function $\phi$ in $\int \phi^* H\phi\,d\tau$ a linear combination of
definite functions $u_i$ which are judged suitable for the problem at hand.
The coefficients appearing in the linear combination may then be treated as
variable parameters. Thus, assume

$$\phi = \sum_{\lambda=1}^{n} a_\lambda u_\lambda$$ (11-94)

where the u's need not form an orthonormal set. We define

$$\int u_i^* u_j\,d\tau = \Delta_{ij}, \qquad \int u_i^* H u_j\,d\tau = \mathcal{H}_{ij}; \qquad E = \frac{\int \phi^* H\phi\,d\tau}{\int \phi^*\phi\,d\tau}$$

The symbol $\mathcal{H}_{ij}$ in place of $H_{ij}$ is to remind the reader of the fact that the
matrix $\mathcal{H}$ does not possess the simple properties of H because the former is
not constructed with an orthonormal, complete set of functions. The
denominator in the expression for E is needed to normalize the function $\phi$.
According to the variational principle, $E \ge E_0$, the lowest energy state
of the system.

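We wish to find the condition that E shall be a minimum, and the minimal
value of E. Insertion of (94) gives

$$E = \sum_{\lambda,\mu=1}^{n} a_\lambda^* a_\mu \mathcal{H}_{\lambda\mu} \Big/ \sum_{\lambda,\mu=1}^{n} a_\lambda^* a_\mu \Delta_{\lambda\mu}$$

This expression will be an extremum, and we hope a minimum, if E is so
adjusted that $\partial E/\partial a_k^*$ and $\partial E/\partial a_k$ are zero for every k from 1 to n. Let us
take the derivative with respect to $a_k^*$ on both sides of the last equation
after it is written in the form

$$E\sum_{\lambda\mu} a_\lambda^* a_\mu \Delta_{\lambda\mu} = \sum_{\lambda\mu} a_\lambda^* a_\mu \mathcal{H}_{\lambda\mu}$$ (11-95)

The result is

$$\frac{\partial E}{\partial a_k^*}\sum_{\lambda\mu} a_\lambda^* a_\mu \Delta_{\lambda\mu} + E\sum_\mu a_\mu \Delta_{k\mu} = \sum_\mu a_\mu \mathcal{H}_{k\mu}, \qquad k = 1, 2, \cdots, n$$

When the first term is omitted ($\partial E/\partial a_k^* = 0$) the remainder of the
equation represents the condition that E shall be a minimum. Differentiation
of (95) with respect to $a_k$ leads in a similar way to

$$E\sum_\lambda a_\lambda^* \Delta_{\lambda k} = \sum_\lambda a_\lambda^* \mathcal{H}_{\lambda k}$$

an equation which is simply the conjugate of the former. Both may conveniently
be written

$$\sum_\mu a_\mu(\mathcal{H}_{k\mu} - \Delta_{k\mu}E) = 0, \qquad k = 1, 2, \cdots, n$$ (11-96)

If this system of equations is to have a solution different from the trivial
one: every $a_\mu = 0$, then the determinant constructed from the coefficients
of the $a_\mu$ must vanish. Thus

$$\begin{vmatrix}
\mathcal{H}_{11} - \Delta_{11}E & \mathcal{H}_{12} - \Delta_{12}E & \cdots & \mathcal{H}_{1n} - \Delta_{1n}E \\
\mathcal{H}_{21} - \Delta_{21}E & \mathcal{H}_{22} - \Delta_{22}E & \cdots & \mathcal{H}_{2n} - \Delta_{2n}E \\
\cdots & \cdots & \cdots & \cdots \\
\mathcal{H}_{n1} - \Delta_{n1}E & \mathcal{H}_{n2} - \Delta_{n2}E & \cdots & \mathcal{H}_{nn} - \Delta_{nn}E
\end{vmatrix} = 0$$ (11-97)

This is an equation of the n-th degree in E and therefore has n roots. The
lowest of these will be an approximation to the lowest energy of the system.
The other roots approximate, though in general much more poorly, to the
n − 1 higher states of the system.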
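For n = 2 the secular equation (11-97) is a quadratic in E and can be solved in closed form. The sketch below is an illustration under simplifying assumptions that are not required by the text: the matrices are taken real and symmetric, the two basis functions are individually normalized ($\Delta_{11} = \Delta_{22} = 1$), and $|\Delta_{12}| < 1$. The function names and the numerical values in the usage lines are hypothetical.

```python
import math

def secular_roots_2x2(h11, h12, h22, d12):
    """Roots E of det(H - Delta*E) = 0 for n = 2, lower root first."""
    # (h11 - E)(h22 - E) - (h12 - d12*E)^2 = 0 expanded as a*E^2 + b*E + c = 0
    a = 1.0 - d12**2
    b = -(h11 + h22 - 2.0 * d12 * h12)
    c = h11 * h22 - h12**2
    disc = math.sqrt(b * b - 4.0 * a * c)
    return (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)

def coefficient_ratio(h11, h12, d12, energy):
    """a_2 / a_1 from the first of eqs. (11-96): a_1(h11 - E) + a_2(h12 - d12 E) = 0."""
    return -(h11 - energy) / (h12 - d12 * energy)

# Hypothetical numbers for two equivalent, non-orthogonal basis functions
e_low, e_high = secular_roots_2x2(-1.0, -0.5, -1.0, 0.4)
ratio_low = coefficient_ratio(-1.0, -0.5, 0.4, e_low)
```

When the two diagonal elements are equal, the roots reduce to $(\mathcal{H}_{11} \pm \mathcal{H}_{12})/(1 \pm \Delta_{12})$ and the coefficients of the lower root are equal, which is the situation met in the example of the next section.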

11.21. Example: The Hydrogen Molecular Ion Problem.—The
$H_2^+$ ion consists of two positive charges +e, which we shall consider stationary
and a distance R apart, and one electron whose distances from the
protons will be denoted by $r_A$ and $r_B$. See Fig. 5.

The Hamiltonian operator is

$$H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{r_A} - \frac{e^2}{r_B} + \frac{e^2}{R}$$

Fig. 11-5 (protons A and B, each of charge +e, a distance R apart)

If the terms $e^2/R - e^2/r_B$ were missing, H would be the Hamiltonian of a
hydrogen atom with its proton at A, whose normal state function is (cf.
eq. 67)

$$u_A = (\pi a_0^3)^{-1/2}e^{-r_A/a_0}$$

On the other hand, if the terms $e^2/R - e^2/r_A$ were missing, the normal state
function would be

$$u_B = (\pi a_0^3)^{-1/2}e^{-r_B/a_0}$$

From a physical point of view one of these solutions is as good as the other:
$u_A$ implies that the electron is entirely attached to proton A, $u_B$ that it is
attached to proton B. Neither is the case. Let us see what happens if
we take for the variation function $\phi$ a linear combination of $u_A$ and $u_B$.
We put

$$\phi = a_Au_A + a_Bu_B$$

using letters as subscripts rather than the number indices which appear in
(94).²¹ The lowest energy is at once obtained as the lowest root of (97)
which takes the simple form

$$\begin{vmatrix}
\mathcal{H}_{AA} - \Delta_{AA}E & \mathcal{H}_{AB} - \Delta_{AB}E \\
\mathcal{H}_{BA} - \Delta_{BA}E & \mathcal{H}_{BB} - \Delta_{BB}E
\end{vmatrix} = 0$$ (11-98)

21 In more complicated molecules it is well to label electrons by numbers, nuclei by
letters. We here follow this convention.

Now $\Delta_{AA} = \int u_A^*u_A\,d\tau = \Delta_{BB} = 1$, because $u_A$ and $u_B$ are normalized.
They are not orthogonal, but $\Delta_{AB} = \Delta_{BA}$. Similarly, $\mathcal{H}_{AB} = \int u_AHu_B\,d\tau = \mathcal{H}_{BA}$,
and $\mathcal{H}_{BB} = \mathcal{H}_{AA}$ since H is insensitive to an interchange of A and
B. With these simplifications the two roots of (98) are found to be

$$E_1 = \frac{\mathcal{H}_{AA} + \mathcal{H}_{AB}}{1 + \Delta_{AB}}, \qquad E_2 = \frac{\mathcal{H}_{AA} - \mathcal{H}_{AB}}{1 - \Delta_{AB}}$$

The $\phi$-functions corresponding to these energies are obtained from (96):

$$a_A(\mathcal{H}_{AA} - E) + a_B(\mathcal{H}_{AB} - \Delta_{AB}E) = 0$$
$$a_A(\mathcal{H}_{BA} - \Delta_{BA}E) + a_B(\mathcal{H}_{BB} - E) = 0$$ (11-99)

On inserting $E = E_1$ we get $a_A = a_B$; hence the corresponding

$$\phi_1 = c_1(u_A + u_B)$$

If $\phi_1$ is to be normalized, $c_1 = [2(1 + \Delta_{AB})]^{-1/2}$. If $E_2$ is inserted in (99),
we find $a_B = -a_A$, so that

$$\phi_2 = c_2(u_A - u_B)$$

The normalizing factor is in this case $c_2 = [2(1 - \Delta_{AB})]^{-1/2}$.

The remainder of the work is the computation of the three quantities
$\Delta_{AB}$, $\mathcal{H}_{AA}$ and $\mathcal{H}_{AB}$. It involves nothing new and will be left to the
reader. The integrals are most easily evaluated in spheroidal coordinates
(cf. eq. 5-40): $\xi = (r_A + r_B)/R$, $\eta = (r_A - r_B)/R$ and $\varphi$, the latter
measured around R. In terms of these

$$d\tau = \frac{R^3}{8}(\xi^2 - \eta^2)\,d\xi\,d\eta\,d\varphi, \quad\text{and}\quad u_Au_B = (\pi a_0^3)^{-1}e^{-\rho\xi}$$
$$1 \le \xi \le \infty; \qquad -1 \le \eta \le 1$$

The following results will be found:

$$\Delta_{AB} = e^{-\rho}\left(1 + \rho + \frac{\rho^2}{3}\right)$$

$$\mathcal{H}_{AA} = E_H + \frac{e^2}{R} + J, \quad\text{where}\quad J = -\int u_A\frac{e^2}{r_B}u_A\,d\tau = \frac{e^2}{R}\left[-1 + e^{-2\rho}(1 + \rho)\right]$$ (11-100)

$$\mathcal{H}_{AB} = \left(E_H + \frac{e^2}{R}\right)\Delta_{AB} + K, \quad\text{where}\quad K = -\int u_A\frac{e^2}{r_A}u_B\,d\tau = -\frac{e^2}{R}\rho e^{-\rho}(1 + \rho)$$

The parameter $\rho = R/a_0$; $E_H$ is defined as in sec. 19. The quantities
J and K are of interest. According to its definition, J represents the
Coulomb attraction energy between a negative charge of density $eu_A^2$
and the proton B. The integral K has no such simple interpretation; it is
called an exchange integral. Its importance is best appreciated if $E_1$ and
$E_2$ are written more explicitly with the use of (100):

$$E_1 = E_H + \frac{e^2}{R} + \frac{J + K}{1 + \Delta_{AB}}$$
$$E_2 = E_H + \frac{e^2}{R} + \frac{J - K}{1 - \Delta_{AB}}$$

Because K is negative, $E_1$ is the lower root. Had we omitted the function
$u_B$ from our trial function $\phi$, the variational result would have been

$$E = E_H + \frac{e^2}{R} + J$$

$E_1$ is lower than this by virtue of the presence of K (and of course $\Delta_{AB}$).
But in classical parlance, a lower energy must be regarded as due to the
presence of additional attractive forces between the constituents of the
system, i.e., a hydrogen atom and a proton. These forces would be
given by $\partial K/\partial R$; they are commonly called exchange forces. They
possess no classical interpretation; their significance is rooted entirely in the
variational method through which they arise.

Of course $E_1$ is only an approximation to the true energy, which is
lower for every R. Its most important feature is that it possesses a
minimum, which explains the stability of the $H_2^+$ ion. Classical mechanics
would yield no minimum and is therefore incompetent to account for the
existence of this ion. A detailed comparison of $E_1$ with the experimental
energy is given in Pauling and Wilson.²²
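The existence of the minimum is readily seen by tabulating $E_1$ as a function of $\rho = R/a_0$. The sketch below evaluates the closed expressions for $\Delta_{AB}$, J and K of this section in units where $e^2/a_0 = 1$, so that $E_H = -\tfrac12$; the grid limits and spacing of the scan are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import math

E_H = -0.5  # energy of the normal hydrogen atom in units of e^2/a0

def overlap(rho):
    """Delta_AB = exp(-rho) (1 + rho + rho^2/3)."""
    return math.exp(-rho) * (1.0 + rho + rho**2 / 3.0)

def coulomb_j(rho):
    """J = (e^2/R) [-1 + exp(-2 rho)(1 + rho)], with e^2/R = 1/rho."""
    return (-1.0 + math.exp(-2.0 * rho) * (1.0 + rho)) / rho

def exchange_k(rho):
    """K = -(e^2/R) rho exp(-rho)(1 + rho)."""
    return -math.exp(-rho) * (1.0 + rho)

def energies(rho):
    """The two roots (E1, E2) at internuclear distance rho."""
    base = E_H + 1.0 / rho
    d, j, k = overlap(rho), coulomb_j(rho), exchange_k(rho)
    return base + (j + k) / (1.0 + d), base + (j - k) / (1.0 - d)

def equilibrium(lo=1.0, hi=6.0, steps=5000):
    """Grid point at which E1 is smallest, together with that value of E1."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    best = min(grid, key=lambda r: energies(r)[0])
    return best, energies(best)[0]
```

With these formulas $E_1$ has its minimum near $\rho \approx 2.5$ and dips below $E_H$ there, whereas $E_2$ stays above $E_H$ over the scanned range.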


Problem. Let $u_0$, $u_1$, $u_2$ be the three lowest energy states of the simple harmonic
oscillator, $H_0$ its Hamiltonian. The Hamiltonian for an oscillator in an electric field is
$H = H_0 + kx$, where k is a constant. Calculate by the variational method the lowest
energy of this system, using as trial functions (a) $u_0$, (b) $a_0u_0 + a_1u_1$, (c) $a_0u_0 + a_1u_1 + a_2u_2$.

Ans. (a) $\tfrac12 h\nu$, (b) $h\nu - \sqrt{k^2x_{01}^2 + \tfrac14(h\nu)^2} \approx \tfrac12 h\nu - k^2x_{01}^2/h\nu$,
(c) $\tfrac12 h\nu - h\nu k^2x_{01}^2/[(h\nu)^2 - k^2x_{12}^2]$ (approximately). Here $x_{ij}$ is defined as $\int u_i x u_j\,d\tau$, as usual.


22 Pauling and Wilson, loc. cit.

11.22. Perturbation Theory.—The following problem is frequently met
in quantum mechanics. We know the energy states of a given system, say
an atom, and also its eigenfunctions. A small perturbation, such as an
electric or magnetic field, is now imposed; this changes, presumably by
slight amounts, both energies and state functions. Mathematically, the
situation is described in this way. We know the solutions and eigenvalues
of

$$H^0\psi_i = E_i^0\psi_i$$ (11-101)

where $H^0$ is the "unperturbed" Hamiltonian. We wish to find solutions
and eigenvalues of

$$H\phi_i = E_i\phi_i, \qquad H = H^0 + H'$$ (11-102)

$H'$ being considered as a "small" addition to $H^0$. (By a small operator
we mean one whose matrix elements, formed with the functions $\psi_\lambda$, are all
small compared with the diagonal elements of $H^0$.)

To solve the problem we use the method of linear variation functions,
using as our trial function

$$\phi = \sum_\lambda a_\lambda\psi_\lambda$$ (11-103)

If we allow an infinite number of terms in this summation and choose
the coefficients properly, we expect $\phi$ to be the correct solution of (102), for
the $\psi_\lambda$ of (101) form a complete set. But since the $\psi_i$ are orthonormal, the
energies are given as the roots of (97) with every $\Delta_{ij}$ replaced by a
Kronecker $\delta_{ij}$, so that E appears only in the principal diagonal. Moreover,

$$\mathcal{H}_{ij} = H_{ij} = (H^0)_{ij} + H'_{ij} = E_i^0\delta_{ij} + H'_{ij}$$
$$H'_{ij} = \int\psi_i^*H'\psi_j\,d\tau$$

Hence the determinant reads

$$\begin{vmatrix}
H'_{11} - (E - E_1^0) & H'_{12} & H'_{13} & H'_{14} & \cdots \\
H'_{21} & H'_{22} - (E - E_2^0) & H'_{23} & H'_{24} & \cdots \\
H'_{31} & H'_{32} & H'_{33} - (E - E_3^0) & H'_{34} & \cdots \\
H'_{41} & H'_{42} & H'_{43} & H'_{44} - (E - E_4^0) & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots
\end{vmatrix} = 0$$ (11-104)

If all its roots could actually be found they would indeed be the exact
energies of our problem. But in the case we are visualizing certain simplifying
approximations are in order. Suppose we are interested in the
energy $E_1$, that is, the energy to which $E_1^0$ is changed by the perturbation.
($E_1$ need not be the lowest energy of our system, for the states may be
labeled in an arbitrary order.) If $E_1^0$ is a non-degenerate level, then $E_1$
will lie much closer to $E_1^0$ than to any other unperturbed $E_i^0$. This suggests
the following approximations:

a. Put $E = E_1^0$ in all diagonal elements except the first.

b. Since every difference $E_i^0 - E_1^0$ for $i \ne 1$ is large compared to $H'_{ii}$,
the latter may be omitted in all diagonal elements except the first.

c. Neglect all non-diagonal elements except those in the first row and
the first column, since they affect $E_1$ only in a secondary way.

When this is done, the determinant reads (we now write $\Delta E_1$ for the
perturbation $E - E_1^0$ we are seeking)

$$\begin{vmatrix}
H'_{11} - \Delta E_1 & H'_{12} & H'_{13} & H'_{14} & \cdots \\
H'_{21} & E_2^0 - E_1^0 & 0 & 0 & \cdots \\
H'_{31} & 0 & E_3^0 - E_1^0 & 0 & \cdots \\
H'_{41} & 0 & 0 & E_4^0 - E_1^0 & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots
\end{vmatrix} = 0$$ (11-105)

It may be evaluated by the usual process of adding multiples of rows or
columns. In this instance, multiply the second row by $H'_{12}/(E_2^0 - E_1^0)$,
and then subtract it from the first. The element $H'_{12}$ will then disappear
from the first row, but the first element is converted into

$$H'_{11} - \Delta E_1 - \frac{H'_{12}H'_{21}}{E_2^0 - E_1^0}$$

Next, multiply the third row by $H'_{13}/(E_3^0 - E_1^0)$ and subtract it from
the first. The result will be disappearance of $H'_{13}$ and addition of
$-H'_{13}H'_{31}/(E_3^0 - E_1^0)$ to the first element. This process is continued until
all non-diagonal elements of the first row have disappeared. We now have

$$\left(H'_{11} - \Delta E_1 - \sum_{\lambda \ne 1}\frac{H'_{1\lambda}H'_{\lambda 1}}{E_\lambda^0 - E_1^0}\right)(E_2^0 - E_1^0)(E_3^0 - E_1^0)\cdots = 0$$

If $E_1^0$ is non-degenerate, as we are supposing, none of the parentheses
except the first can be zero. We therefore conclude

$$\Delta E_1 = H'_{11} - \sum_{\lambda \ne 1}\frac{H'_{1\lambda}H'_{\lambda 1}}{E_\lambda^0 - E_1^0}$$ (11-106)

and this is the Rayleigh-Schrödinger perturbation formula. The quantity
$H'_{11}$ is often called the first-order perturbation, the sum on the right is called
the second-order perturbation. By retaining more elements in (104) third
and higher orders may be computed, but these are rarely used. When the
approximation (106) is not sufficient it is generally preferable to return to
the variation scheme, or to find a more successful way of evaluating the
determinant (104).
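The quality of formula (11-106) can be judged by comparing it with an exact solution in the one case where the determinant is trivially solvable, a system of two levels. The sketch below is an illustration only: the unperturbed energies and the matrix $H'$ are hypothetical numbers, taken real and symmetric, and the level index k generalizes the index 1 of (11-106).

```python
import math

def second_order_shift(k, e0, hp):
    """H'_kk minus the sum over lam != k of H'_k,lam H'_lam,k / (E0_lam - E0_k)."""
    shift = hp[k][k]
    for lam in range(len(e0)):
        if lam != k:
            shift -= hp[k][lam] * hp[lam][k] / (e0[lam] - e0[k])
    return shift

def exact_shift_two_level(e0, hp):
    """Exact change of the lower level of a real symmetric two-level problem."""
    h11 = e0[0] + hp[0][0]
    h22 = e0[1] + hp[1][1]
    mean = 0.5 * (h11 + h22)
    half_gap = 0.5 * (h22 - h11)
    lower_root = mean - math.sqrt(half_gap**2 + hp[0][1]**2)
    return lower_root - e0[0]

# Hypothetical two-level system: level spacing 1, small perturbation
e0 = [0.0, 1.0]
hp = [[0.01, 0.05],
      [0.05, -0.02]]
approx = second_order_shift(0, e0, hp)   # 0.01 - 0.05**2 / 1.0
exact = exact_shift_two_level(e0, hp)
```

The two results differ only by an amount of the third order in the small matrix elements, which is the error committed by approximations a to c.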




Formula (106) may, of course, be used to calculate the perturbation in
any energy level which is non-degenerate; to show this fact it may be
written in the form

$$\Delta E_k = H'_{kk} - {\sum_\lambda}'\frac{|H'_{k\lambda}|^2}{E_\lambda^0 - E_k^0}$$ (11-107)

where we have also used the Hermitian property of $H'_{\lambda k}$. The prime on the
summation symbol indicates that the term in which $\lambda = k$ should be
omitted.

Next, let us find the coefficients $a_\lambda$ in (103). They are obtained from
(96) which now reads

$$\sum_\mu a_\mu(E_k^0\delta_{k\mu} + H'_{k\mu} - E\delta_{k\mu}) = 0, \qquad k = 1, 2, \cdots$$

In accordance with the approximations which led to eq. (106) we put
$E = E_1^0$ and neglect every $H'_{k\mu}$ unless one of the subscripts is 1. We then
find

$$a_1H'_{21} + a_2(E_2^0 - E_1^0) = 0 \quad\text{if } k = 2$$
$$a_1H'_{31} + a_3(E_3^0 - E_1^0) = 0 \quad\text{if } k = 3, \text{ etc.}$$

Hence

$$a_\lambda = \frac{H'_{\lambda 1}}{E_1^0 - E_\lambda^0}\,a_1, \qquad \lambda \ne 1$$

or in general, if we are interested not in $E_1$ but in $E_k$,

$$a_\lambda = \frac{H'_{\lambda k}}{E_k^0 - E_\lambda^0}\,a_k, \qquad \lambda \ne k$$ (11-108)

The coefficient $a_k$ must be chosen so that $\phi$ is normalized. Since all other
$a_\lambda$ are small, its value is very nearly unity and may be taken as such.
Formulas (107) and (108) have been derived by assuming that the level,
k, whose perturbation is being calculated, was non-degenerate. For
degenerate levels both formulas obviously fail, for they contain terms with
vanishing denominators (several $E_\lambda^0$ being equal to $E_k^0$). To deal with the
case of degeneracy we have to return to the fundamental determinant
(104). If the functions $u_1, u_2, \cdots, u_n$ all belong to the same energy $E_1^0$ (we
then say that the level $E_1^0$ has an n-fold degeneracy), these functions are
equally concerned in the perturbation, and if we formerly retained all
matrix elements of the form $H'_{1\lambda}$, we must now retain $H'_{2\lambda}, H'_{3\lambda}, \cdots, H'_{n\lambda}$
also. But for most purposes sufficient accuracy results if we neglect all
elements connecting a state of the degenerate group with all states not
belonging to that group. Eq. (104) reduces in this case to

$$\begin{vmatrix}
H'_{11} - \Delta E & H'_{12} & H'_{13} & \cdots & H'_{1n} \\
H'_{21} & H'_{22} - \Delta E & H'_{23} & \cdots & H'_{2n} \\
H'_{31} & H'_{32} & H'_{33} - \Delta E & \cdots & H'_{3n} \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
H'_{n1} & H'_{n2} & H'_{n3} & \cdots & H'_{nn} - \Delta E
\end{vmatrix} = 0$$ (11-109)

The n roots of this equation (of which some may coincide) are the energies
into which $E_1^0$ will "split" as the result of the perturbation. They cannot,
of course, be represented by a general formula.

These energies are said to represent the first-order perturbation. If
greater accuracy is desired the work may be continued in this way. By
substituting the first-order energies into eqs. (96) and neglecting all states
not belonging to the degenerate group, n sets of coefficients $a_1, a_2, \cdots, a_n$
are found, each set belonging to a single first-order energy. This yields
n functions

$$v_i = \sum_{\lambda=1}^{n} a_{\lambda i}u_\lambda$$

If now we construct matrix elements with the v-functions, $H'_{ij} = \int v_i^*H'v_j\,d\tau$,
these will be diagonal; for solving (109) is the well-known
procedure for diagonalizing the matrix $H'$. (See Chapter 10.) Hence,
when the v-functions are chosen to represent the n degenerate states, the
second-order perturbation can be computed by formula (107), from which
the terms with vanishing denominator are now absent because every $H'_{k\lambda}$
corresponding to them is zero.
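For a doubly degenerate level the whole procedure can be followed explicitly. The sketch below is an illustration under stated assumptions: the block of $H'$ within the degenerate group is real and symmetric with a non-vanishing off-diagonal element, and the numbers in the usage lines are hypothetical.

```python
import math

def first_order_splitting(h11, h12, h22):
    """The two roots Delta E of (11-109) for n = 2, lower root first."""
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), h12)
    return mean - radius, mean + radius

def adapted_functions(h11, h12, h22):
    """Normalized coefficient pairs (a1, a2) of the functions v, one per root.

    Assumes h12 != 0; uses the first row of (96): a1 (h11 - dE) + a2 h12 = 0.
    """
    pairs = []
    for root in first_order_splitting(h11, h12, h22):
        a1, a2 = h12, root - h11
        norm = math.hypot(a1, a2)
        pairs.append((a1 / norm, a2 / norm))
    return pairs

def matrix_element(v, w, h11, h12, h22):
    """H' element between two real combinations v and w of u1 and u2."""
    return (v[0] * (h11 * w[0] + h12 * w[1])
            + v[1] * (h12 * w[0] + h22 * w[1]))

# Hypothetical block: equal diagonal elements, coupling 0.3
block = (0.0, 0.3, 0.0)
roots = first_order_splitting(*block)
v1, v2 = adapted_functions(*block)
off_diagonal = matrix_element(v1, v2, *block)
```

In this example the level splits symmetrically into $\pm 0.3$, the adapted functions are $(u_1 \mp u_2)/\sqrt{2}$, and the element of $H'$ connecting them vanishes, so that formula (107) may subsequently be applied without vanishing denominators.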

11.23. Example: Non-Degenerate Case. The Stark Effect.—Let
$H^0$ represent the Hamiltonian operator for any one-electron system, and
let $\psi_1, \psi_2, \cdots$ be its eigenfunctions. When a uniform electric field along X is
applied, the term $H' = -eFx$ is added to $H^0$, e being the electronic charge
and F the field strength. The normal state of the system is non-degenerate,
hence formula (107) may be used. Denoting the normal state by the
subscript zero, we find

$$\Delta E = -eFx_{00} - e^2F^2{\sum_\lambda}'\frac{|x_{0\lambda}|^2}{E_\lambda^0 - E_0^0}$$ (11-110)

Here $x_{\lambda\mu} = \int\psi_\lambda^* x\psi_\mu\,d\tau$. The first term on the right is usually zero because
$|\psi_0|^2$ is an even function of x; thus the "first-order Stark effect" is absent.
In classical physics, the increment in energy of an atom due to a static
electric field is expressed in terms of the polarizability $\alpha$ in the form

$$\Delta E = -\tfrac12\alpha F^2$$

On comparing this with (110) we find for the polarizability of the normal
state of our system

$$\alpha = 2e^2{\sum_\lambda}'\frac{|x_{0\lambda}|^2}{E_\lambda^0 - E_0^0}$$

For an oscillator, this takes a particularly simple form, since all $x_{0\lambda}$
vanish with the exception of $x_{01} = \sqrt{1/2\beta}$ (cf. Chapter 3, eqs. 92 and 93).
Also, $E_\lambda^0 = (\lambda + \tfrac12)h\nu$. Thus

$$\alpha = \frac{e^2}{\beta h\nu}$$

Comparison with the problem of sec. 21 shows that second-order perturbation
theory gives in this instance the same result as the variational
method with the trial function $a_0\psi_0 + a_1\psi_1$. In general, however, the
use of a simple variation function yields a much poorer result for the
polarizability than the method of sec. 22.

11.24. Example: Degenerate Case. The Normal Zeeman Effect.—
The energy states of the hydrogen atom were found to be

$$R_{nl}(r)\,Y_l(\theta, \varphi)$$

To a given l, there belong 2l + 1 spherical harmonics of the form

$$Y_l = \sum_{m=-l}^{l} c_mP_l^m(\cos\theta)e^{im\varphi}, \qquad (P_l^{-m} = P_l^m)$$

and each such combination with its own set of coefficients $c_m$ forms a proper
eigenfunction when multiplied by $R_{nl}$. The energy does not depend on m:
the state under consideration has therefore a (2l + 1)-fold degeneracy.

Let us choose the 2l + 1 functions in the simplest possible way, namely
by letting each $Y_l$ contain only one term, as follows:

$$R_{nl}c_{-l}P_l^{l}e^{-il\varphi}, \quad R_{nl}c_{-l+1}P_l^{l-1}e^{-i(l-1)\varphi}, \quad \cdots, \quad R_{nl}c_lP_l^{l}e^{il\varphi}$$

and label them $\phi_1, \phi_2, \cdots, \phi_{2l+1}$, in that order.

The Zeeman effect is the splitting of the energy levels of an atom in a
magnetic field. When a uniform magnetic field along the Z-axis and of
strength F is applied to the hydrogen atom, its unperturbed Hamiltonian
takes on the extra term²³

$$H' = -\frac{i\hbar e}{2Mc}F\frac{\partial}{\partial\varphi} = -iA\frac{\partial}{\partial\varphi}$$

23 See Van Vleck, J. H., "The Theory of Electric and Magnetic Susceptibilities,"
Oxford, 1932. We write here M for the electron mass to avoid conflict with the summation
index m (magnetic quantum number). Note that $H'$ is the quantum representation
of $(e/2Mc)\mathbf{F}\cdot\mathbf{L}$, where $\mathbf{L}$ is the angular momentum vector of sec. 11.3.

Each matrix element $H'_{ij} = \int\phi_i^*H'\phi_j\,d\tau$ contains the factor $\int R_{nl}^2r^2\,dr$
which, by virtue of the normalization of the radial functions, is unity. If
we form, e.g., $H'_{12}$ we obtain an integral over $\theta$ times $\int_0^{2\pi}e^{i\varphi}\,d\varphi$,
and this vanishes. In the same way all other non-diagonal matrix elements
are seen to be zero. The diagonal element $H'_{11}$ is

$$-iA\cdot(-il)\cdot\int|\phi_1|^2\,d\tau = -Al$$

and the others are similarly constructed.
When these elements are substituted into (109) we have

$$\begin{vmatrix}
-lA - \Delta E & 0 & 0 & \cdots & 0 \\
0 & -(l-1)A - \Delta E & 0 & \cdots & 0 \\
0 & 0 & -(l-2)A - \Delta E & \cdots & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & lA - \Delta E
\end{vmatrix} = 0$$

The determinant is already diagonal; our choice of functions was a
fortunate one. The perturbed energies are clearly

$$\Delta E = mA = m\frac{\hbar eF}{2Mc}, \qquad m = -l, -l+1, \cdots, 0, \cdots, l$$

Classically, an electron in a magnetic field F performs a uniform precession
of angular frequency $\omega_L = eF/2Mc$, known as the Larmor frequency.
Thus we see that $\Delta E = m\hbar\omega_L$.
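That the perturbation matrix is diagonal in the chosen functions can be verified by direct quadrature over the azimuth. The sketch below is an illustration under simplifying assumptions: only the dependence on the azimuthal angle is treated, the radial and polar factors being taken as normalized, so that each basis function is represented by $e^{im\varphi}/\sqrt{2\pi}$; A is set to 1 by default, and the number of quadrature points is an arbitrary choice.

```python
import cmath
import math

def basis(m, phi):
    """Normalized azimuthal factor exp(i m phi) / sqrt(2 pi)."""
    return cmath.exp(1j * m * phi) / math.sqrt(2.0 * math.pi)

def perturbation_matrix(l, a=1.0, points=64):
    """Matrix of -iA d/dphi between the functions m = -l, ..., l (in that order)."""
    ms = list(range(-l, l + 1))
    dphi = 2.0 * math.pi / points
    matrix = []
    for mi in ms:
        row = []
        for mj in ms:
            total = 0j
            for k in range(points):
                phi = k * dphi
                # -iA d/dphi acting on exp(i mj phi) gives A mj exp(i mj phi)
                total += basis(mi, phi).conjugate() * (a * mj) * basis(mj, phi) * dphi
            row.append(total)
        matrix.append(row)
    return ms, matrix

ms, h_prime = perturbation_matrix(2)
```

For l = 2 the diagonal elements come out as −2A, −A, 0, A, 2A and every off-diagonal element vanishes to rounding accuracy, in agreement with the determinant above.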


Problem. Calculate the Stark effect of the rigid rotator (cf. sec. 11.12), for the
state l = 3, adopting the same choice for the spherical harmonics as above. Here
$H' = -eaF\cos\theta$, provided the electric field F is along Z. The determinant will not be
diagonal. To calculate the matrix elements, use formulas (8-48 and 53). Include in
your calculation successively more states: l = 2, 3, 4; l = 1, 2, 3, 4, 5.


TIME-DEPENDENT STATES. SCHRÖDINGER'S TIME EQUATION


11.25. General Considerations.—In all preceding considerations we
have assumed that the states of the systems in question were stationary
ones, that the time coordinate could be disregarded in describing them. In
generalizing the theory so as to make it applicable to states which change in
time it is well to look back and see why a time-free description was possible
thus far.

It is important to note that the time, t, in classical mechanics is canonically
conjugate to the energy, E, in the same sense that x is conjugate to
$p_x$. Let us then for the moment consider the operator $P_x = -i\hbar(\partial/\partial x)$.
Its eigenstates were seen to be (cf. eq. 9) $\psi_p = ce^{ipx/\hbar}$. What do they tell
us about the distribution of the system in x? The answer is, it is uniform.
Whatever is true at the point $x_1$, is also true at the point $x_2$. This is the
meaning of the uncertainty principle applied to the case at hand: if the
momentum is known with certainty, the state function is entirely non-committal
with regard to x. If in the calculation of the mean value of an
operator Q,

$$\bar q = \int\psi_p^*Q\psi_p\,dx$$

Q did not depend on x, we could have afforded to neglect the factor $e^{ipx/\hbar}$ of
$\psi_p$ altogether. It had to be included, however, because most operators of
interest do depend on x.

But this trivial situation existed with regard to the time coordinate in
all the Schrödinger problems considered heretofore. The states were those
in which the energy was known with certainty, and for this reason the state
functions were completely indiscriminate in respect to t. What was true
at $t_1$ was also true at $t_2$. Moreover, the other operators used were independent
of t. This condition will always be present as long as we are dealing
with closed systems, for the energy will then be constant in time.

When the system is an open one, the present method must clearly fail.
But the last remarks contain the hint that we should, perhaps, associate
with E the operator $-i\hbar(\partial/\partial t)$. This would lead to the eigenvalue
equation

$$-i\hbar\frac{\partial\psi}{\partial t} = E\psi$$

which is certainly too simple because the energy depends on other things
beside the time. The example above gives us no definite lead at this point
because $p_x$ does possess the single dependence on x. There is, however,
only one reasonable way to include these other variables, namely, to put
them into E, which thereby ceases to be an eigenvalue: E must be replaced
by the Hamiltonian operator H. We then arrive at Schrödinger's time
equation

$$-i\hbar\frac{\partial\psi}{\partial t} = H\psi$$ (11-111)

H is to be constructed as before by replacement of every Cartesian
momentum $p_i$ by $-i\hbar(\partial/\partial q_i)$ and the dependence on t is to be introduced
explicitly.

It is immaterial, of course, whether we choose eq. (111) or its complex
conjugate equation. The latter choice has certain advantages and will
here be made. Furthermore, we shall use the symbol $u$ (more or less
generally) for time-dependent state functions and thus record Schrödinger's
time equation in the form

$$i\hbar\frac{\partial u(q_1 \cdots q_n; t)}{\partial t} = H\left(-i\hbar\frac{\partial}{\partial q_1}, \cdots, -i\hbar\frac{\partial}{\partial q_n};\ q_1 \cdots q_n;\ t\right) u(q_1 \cdots q_n; t) \qquad (11\text{-}112)$$

This equation, being of the first order in $t$, permits prediction of the state $u$
at any future (or past) time when $u$ is known as a function of the coordinates
at present. Although it is closely related to the preceding developments,
eq. (112) is a new postulate not derivable from those already given.

The present theory must be valid also in the special case when $H$ does
not contain $t$. When that is true eq. (112) is separable. On writing
$u = \psi(q_1 \cdots q_n) \cdot f(t)$ it becomes equivalent to the equation

$$\frac{H\psi}{\psi} = i\hbar\,\frac{\partial f/\partial t}{f}$$

each side of which must represent a constant. But in view of the form of
the left-hand side, that constant must be one of the eigenvalues of the
operator $H$, say $E_\lambda$, so that

$$\frac{\partial f}{\partial t} = -\frac{iE_\lambda}{\hbar}\,f$$

Hence

$$f = c\,e^{-(iE_\lambda/\hbar)t}$$

The general solution of eq. (112) for the special case in which $H$ is
independent of the time is

$$u = \sum_\lambda c_\lambda\,e^{-(iE_\lambda/\hbar)t}\,\psi_\lambda \qquad (11\text{-}113)$$

We have formerly said that any state function, such as $u$, could be
expanded in the orthonormal system of functions $\psi_\lambda$. This expansion was
written as

$$u = \sum_\lambda a_\lambda \psi_\lambda$$

We now see that this is indeed true even when the analysis is made on
the basis of eq. (112), but the coefficients $a_\lambda$ are always functions of the
time: $a_\lambda = c_\lambda e^{-(iE_\lambda/\hbar)t}$. The mean value of $E$, computed for the state
(113), is

$$\bar{E} = \sum_\lambda a_\lambda^* a_\lambda E_\lambda = \sum_\lambda |c_\lambda|^2 E_\lambda$$

It is independent of $t$. But the probability of finding the system at the
point $q_1 \cdots q_n$ of configuration space,

$$u^*u = \sum_{\lambda\mu} c_\lambda^* c_\mu\,\psi_\lambda^*\psi_\mu\,e^{(i/\hbar)(E_\lambda - E_\mu)t},$$

is a superposition of oscillating functions of the time. The only way for this
time dependence to be obliterated would be to have $c_\lambda = \delta_{\lambda j}$ in (113), in
which case

$$u^*u = \psi_j^*\psi_j$$

Thus, whenever a state is formed by superposition of energy eigenstates,
the mean energy of the system remains constant, but the configuration of
the system changes in time. The reader should note, of course, that the
solution of the Schrödinger equation (12) when multiplied by $e^{-(iE/\hbar)t}$ is
also a solution of (112), but that the solution of (112) does not in general
satisfy (12).
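
The statement that a superposition of energy eigenstates keeps a constant mean energy while its configuration changes in time can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the particle-in-a-box functions as the orthonormal set, works in units with $\hbar = m = 1$, and uses hypothetical helper names (`box_state`, `box_energy`, `density`, `mean_energy`) that do not occur in the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def box_state(n, x, width=1.0):
    """Illustrative eigenfunction: n-th state of a particle in a box."""
    return math.sqrt(2.0 / width) * math.sin(n * math.pi * x / width)


def box_energy(n, width=1.0, mass=1.0):
    """Energy of the n-th box state, (hbar*pi*n/width)^2 / (2*mass)."""
    return (HBAR * math.pi * n / width) ** 2 / (2.0 * mass)


def state(x, t, coeffs):
    """u(x, t) = sum over n of c_n exp(-i E_n t / hbar) psi_n(x)."""
    return sum(
        c * cmath.exp(-1j * box_energy(n) * t / HBAR) * box_state(n, x)
        for n, c in coeffs.items()
    )


def density(x, t, coeffs):
    """Probability density u*u at position x and time t."""
    return abs(state(x, t, coeffs)) ** 2


def mean_energy(coeffs):
    """Mean energy sum |c_n|^2 E_n; it contains no reference to the time."""
    return sum(abs(c) ** 2 * box_energy(n) for n, c in coeffs.items())


# Equal mixture of the two lowest states (an arbitrary illustrative choice).
mixed = {1: 1 / math.sqrt(2), 2: 1 / math.sqrt(2)}
period = 2 * math.pi * HBAR / (box_energy(2) - box_energy(1))

for t in (0.0, 0.25 * period, 0.5 * period):
    print(f"t = {t:.4f}   u*u at x = 0.25: {density(0.25, t, mixed):.4f}")
print("mean energy:", mean_energy(mixed))
```

For a single eigenstate the density printed would be the same at every time; for the mixture it oscillates with the period fixed by the energy difference of the two states.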


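Problem. Let the time-dependent Hamiltonian be $H = H_0 + V(t)$, where $H_0$
acts only on space coordinates and has eigenfunctions $\psi_\lambda$, eigenvalues $E_\lambda$. Show that

$$u = \sum_\lambda c_\lambda \psi_\lambda \exp\left[-\frac{i}{\hbar}\left(E_\lambda t + \int V\,dt\right)\right]$$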


11.26. The Free Particle; Wave Packets.—The eigenfunctions of the
energy of a free mass point (cf. sec. 11.9) moving in one dimension without
restriction are $\psi_k = e^{ikx}$, its energies $E_k = (\hbar^2/2m)k^2$, and there is no
quantization. The general solution of eq. (112) for the free particle is therefore

$$u = \int_{-\infty}^{\infty} c(k)\,e^{i[kx - (\hbar/2m)k^2 t]}\,dk \qquad (11\text{-}114)$$

a function constructed after the manner of (113) but with an integral
instead of a sum. An integral very similar to this has been already
encountered in the mathematical formulation of waves (cf. eq. 7-88) and
of diffusion phenomena (eq. 7-53). It is interesting to inquire what form
$u$ will have at some time $t$ if at $t = 0$ it is given by $u = u_0(x)$. The
coefficient $c(k)$ may be determined by Fourier analysis. We have

$$u_0 = \int_{-\infty}^{\infty} c(k)\,e^{ikx}\,dk$$

whence by eq. 8-13

$$c(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} u_0(\xi)\,e^{-ik\xi}\,d\xi$$

Eq. (114) therefore reads

$$u(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u_0(\xi)\,e^{i[k(x-\xi) - (\hbar/2m)k^2 t]}\,d\xi\,dk$$

In this instance, the integration over $k$ cannot be performed (as it
could in the diffusion problem, sec. 7.14). To proceed further it is
necessary to introduce the function $u_0$ explicitly.
Assume that $u_0 = e^{-x^2/2a^2}$. Then, with the use of the formula

$$\int_{-\infty}^{\infty} e^{-\alpha x^2 + \beta x}\,dx = \sqrt{\frac{\pi}{\alpha}}\;e^{\beta^2/4\alpha} \qquad (11\text{-}115)$$

we find

$$c(k) = \frac{a}{\sqrt{2\pi}}\,e^{-a^2k^2/2}$$

Hence

$$u = \frac{a}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-k^2[a^2/2 + i(\hbar/2m)t] + ikx}\,dk
= \left(1 + \frac{i\hbar t}{ma^2}\right)^{-1/2}\exp\left[-\frac{x^2}{2\left(a^2 + i\dfrac{\hbar}{m}t\right)}\right] \qquad (11\text{-}116)$$

again with the aid of (115).

Eq. (114) represents a superposition of waves of wave length $2\pi/k$ and
frequency $\nu = (\hbar/4\pi m)k^2$. The form of $u_0$ here chosen describes a
concentration of waves about the origin, a phenomenon called a "wave packet."
Such a wave packet does not retain its spatial distribution; eq. (116) is
characteristic of the manner in which it diffuses.

From the point of view of quantum mechanics, $u_0^*u_0$ is the probability
density of the particle at $t = 0$. It represents a Gauss error function of
"width" $a$. At time $t$,

$$u^*u = \left(1 + \frac{\hbar^2 t^2}{m^2a^4}\right)^{-1/2}\exp\left[-\frac{x^2}{a^2 + \dfrac{\hbar^2 t^2}{m^2a^2}}\right]$$

The probability density is still a Gauss function, but of smaller maximum
and of width $[a^2 + (\hbar^2/m^2a^2)t^2]^{1/2}$.
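
The width formula lends itself to a quick numerical estimate of how fast a packet spreads. The following is a rough sketch, not part of the original treatment: it assumes CGS units, takes "diffusing through twice the distance" to mean that the width has grown from $a$ to $2a$ (which gives $t = \sqrt{3}\,ma^2/\hbar$), and uses the hypothetical names `packet_width` and `doubling_time`. The electron width $10^{-8}$ cm in the usage lines is an illustrative value only.

```python
import math

HBAR_CGS = 1.0546e-27        # erg * s
ELECTRON_MASS_G = 9.109e-28  # g


def packet_width(a, t, mass, hbar=HBAR_CGS):
    """Width [a^2 + (hbar^2 / m^2 a^2) t^2]^(1/2) of the Gaussian density."""
    return math.sqrt(a * a + (hbar * t / (mass * a)) ** 2)


def doubling_time(a, mass, hbar=HBAR_CGS):
    """Time at which the width equals 2a: solve a^2 + (hbar t / m a)^2 = 4 a^2."""
    return math.sqrt(3.0) * mass * a * a / hbar


# Illustrative inputs (assumptions): an electron packet of width 1e-8 cm
# and a one-gram object of width 1 cm.
print("electron:", doubling_time(1e-8, ELECTRON_MASS_G), "s")
print("1 g body:", doubling_time(1.0, 1.0), "s")
```

The contrast between the two results (a tiny fraction of a second against a time far exceeding any laboratory scale) is the quantitative content of the remark that macroscopic bodies do not visibly diffuse.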


Problem a. Compute how long it would take an electron, localized within 
= 107! cm., to diffuse through twice that distance. 
b. How long would it take an object weighing one gram, localized within 1 cm., to 
diffuse through twice that distance? 


c. Show that if $u_0 = ce^{iKx}$, where $K$ is a constant, the wave will be of the form
$u = ce^{i[Kx - (\hbar/2m)K^2 t]}$.


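If our particle is free to move in three dimensions, then as shown in
sec. 11.9,

$$\psi_{\mathbf{k}} = e^{i\mathbf{k}\cdot\mathbf{r}}, \quad \text{and} \quad E_k = \frac{\hbar^2}{2m}k^2$$

Hence (114) has the form

$$u = \int c(\mathbf{k})\,e^{i[\mathbf{k}\cdot\mathbf{r} - (\hbar/2m)k^2 t]}\,d\mathbf{k} \qquad (11\text{-}117)$$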
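Again, if $u = u_0(x,y,z)$ at $t = 0$,

$$u_0 = \int c(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k}$$

whence by 3-dimensional Fourier analysis

$$c(\mathbf{k}) = \frac{1}{(2\pi)^3}\int u_0(\boldsymbol{\rho})\,e^{-i\mathbf{k}\cdot\boldsymbol{\rho}}\,d\boldsymbol{\rho}$$

the vector $\boldsymbol{\rho}$ having components $\xi$, $\eta$, $\zeta$.
Assume now, in analogy with the one-dimensional case, that

$$u_0 = e^{-r^2/2a^2}$$

At $t = 0$ the wave packet is a spherical concentration of waves centered
about the origin; the probability packet has a similar shape and a width $a$.
On inserting $u_0$ into the relation for $c$ we have

$$c(\mathbf{k}) = \frac{1}{(2\pi)^3}\int e^{-(\xi^2/2a^2) - ik_x\xi}\,d\xi \int e^{-(\eta^2/2a^2) - ik_y\eta}\,d\eta \int e^{-(\zeta^2/2a^2) - ik_z\zeta}\,d\zeta
= \left(\frac{a}{\sqrt{2\pi}}\right)^3 e^{-(a^2/2)k^2}$$

This gives

$$u = (2\pi)^{-3/2}a^3\int e^{-k_x^2[(a^2/2) + i(\hbar/2m)t] + ik_x x}\,dk_x$$

times two similar integrals with $k_x$ replaced by $k_y$ and $k_z$, $x$ by $y$ and $z$. Hence

$$u = \left(1 + \frac{i\hbar t}{ma^2}\right)^{-3/2}\exp\left[-\frac{r^2}{2\left(a^2 + i\dfrac{\hbar}{m}t\right)}\right]$$

The interpretation of this result is not different from that of (116).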

Before leaving the subject of "particle waves," we should remark that
every component wave of the packet (117), being of the form $e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$,
travels in a positive direction along $\mathbf{k}$. Had we chosen the sign as in eq.
(111) and not as in (112), the waves would have been of the form
$e^{i(\mathbf{k}\cdot\mathbf{r} + \omega t)}$, which implies that they travel along $-\mathbf{k}$. Since $\mathbf{k}\hbar$ represents
the momentum of the particle, the latter choice is an unsuitable one. We
also note that the wave length $\lambda = 2\pi/k = 2\pi\hbar/mv = h/mv$ conforms to
the De Broglie formula. The phase velocity of the waves is $\nu\lambda = \hbar k/2m =
mv/2m = v/2$, but their group velocity,²⁴ defined as $2\pi(d\nu/dk) = v$, is
equal to the classical speed of the particle.
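
The two velocities can be checked directly from the dispersion relation $\nu = (\hbar/4\pi m)k^2$ quoted for the free particle. The sketch below is an illustration only: it assumes units with $\hbar = 1$ by default, evaluates the derivative $d\nu/dk$ by a central difference, and uses hypothetical function names.

```python
import math


def frequency(k, mass=1.0, hbar=1.0):
    """Free-particle frequency nu = (hbar / 4 pi m) k^2."""
    return hbar * k * k / (4.0 * math.pi * mass)


def phase_velocity(k, mass=1.0, hbar=1.0):
    """nu * lambda with lambda = 2 pi / k."""
    return frequency(k, mass, hbar) * (2.0 * math.pi / k)


def group_velocity(k, mass=1.0, hbar=1.0, dk=1e-6):
    """2 pi d(nu)/dk, with the derivative taken as a central difference."""
    dnu = frequency(k + dk, mass, hbar) - frequency(k - dk, mass, hbar)
    return 2.0 * math.pi * dnu / (2.0 * dk)


k, m = 3.0, 2.0                 # arbitrary illustrative values
classical_speed = 1.0 * k / m   # v = hbar k / m with hbar = 1
print(phase_velocity(k, m), group_velocity(k, m), classical_speed)
```

With these inputs the phase velocity comes out as half the classical speed and the group velocity agrees with it, as stated above.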


24 For a discussion of group velocity, see Sommerfeld, A., "Wellenmechanischer
Ergänzungsband," Friedr. Vieweg & Sohn, Braunschweig, 1929, p. 46.
11.27. Equation of Continuity, Current.—If the state function changes
in time in accordance with the Schrödinger equation

$$Hu = i\hbar\dot{u} \qquad (11\text{-}118)$$

will it remain normalized? If it does not, there occurs a destruction or
creation of probability; while initially there was certainty of finding the
particle somewhere in space, there might later be uncertainty, a situation
which would clearly be physically untenable. Permanence of normalization,
however, follows immediately from (118). For

$$\frac{\partial}{\partial t}\int u^*u\,d\tau = \int [\dot{u}^*u + u^*\dot{u}]\,d\tau = \frac{i}{\hbar}\int [uH^*u^* - u^*Hu]\,d\tau$$

because of (118), and the last expression is zero on account of the Hermitian
character of $H$.

Having shown that $u^*u$ is conserved we can define a probability current
by subjecting $u^*u$, which we will call $\rho$ for the moment, to the equation of
continuity

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{I} = 0 \qquad (11\text{-}119)$$

Whatever $\mathbf{I}$ turns out to be must be regarded as the current corresponding
to the "flow" of the quantity $u^*u$. We shall limit our consideration
to the case of a single particle so that

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(x,y,z)$$

although generalization to many-dimensioned configuration space is easy.


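Again because of (118)

$$\frac{\partial\rho}{\partial t} = \dot{u}^*u + u^*\dot{u} = \frac{i}{\hbar}(uH^*u^* - u^*Hu)
= \frac{i\hbar}{2m}(u^*\nabla^2u - u\nabla^2u^*) = \nabla\cdot\left[\frac{i\hbar}{2m}(u^*\nabla u - u\nabla u^*)\right]$$

To satisfy (119) we must put²⁵

$$\mathbf{I} = -\frac{i\hbar}{2m}(u^*\nabla u - u\nabla u^*) \qquad (11\text{-}120)$$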


It is interesting to observe that a state u which has no complex depend- 
ence on a space variable has no current associated with it. Thus, in the 


25 This form of $\mathbf{I}$ is correct so long as the potential energy $V$ is of the scalar form
here used. When $H$ contains a vector potential, $\mathbf{A}$, the term $(e/c)\mathbf{A}$ must be added to
the expression for the current here given.
free particle problem, $\cos kx$ and $\sin kx$ represent stationary states, but
$e^{ikx}$ and $e^{-ikx}$ have currents.
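
The contrast between real and complex free-particle states is easy to confirm numerically from the one-dimensional form of (120), $I = -(i\hbar/2m)(u^*\,\partial u/\partial x - u\,\partial u^*/\partial x)$. The sketch below is illustrative only: it assumes units with $\hbar = m = 1$, approximates the derivative by a central difference, and the function name `current` is hypothetical.

```python
import cmath
import math


def current(u, x, mass=1.0, hbar=1.0, h=1e-5):
    """One-dimensional probability current of the state function u at x."""
    value = complex(u(x))
    deriv = (complex(u(x + h)) - complex(u(x - h))) / (2.0 * h)
    flow = -(1j * hbar / (2.0 * mass)) * (
        value.conjugate() * deriv - value * deriv.conjugate()
    )
    return flow.real  # the expression is real; any imaginary part is rounding


k = 2.0  # arbitrary illustrative wave number
print(current(lambda x: cmath.exp(1j * k * x), 0.3))   # close to  hbar*k/m
print(current(lambda x: cmath.exp(-1j * k * x), 0.3))  # close to -hbar*k/m
print(current(lambda x: math.cos(k * x), 0.3))         # close to zero
```

The running waves carry currents $\pm\hbar k/m$, equal in magnitude to the classical velocity, while the standing waves carry none.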


Problem. Compute $\mathbf{I}$ for the various regions of the barrier problems considered
in sec. 11.10.


11.28. Application of Schrödinger's Time Equation. Simple Radiation
Theory.—The cases in which eq. (118) can be solved exactly are not
numerous and not very interesting. When the time equation (118) is not
separable, resort must be taken to approximation methods, the most useful
of which will now be illustrated.

Let an atom, whose normal Hamiltonian function, free from all
perturbations, is $H_0$, be suddenly subjected to a light wave which adds a
perturbing energy

$$V(x,t) = -eF_0x\sin\omega t \qquad (11\text{-}121)$$

to $H_0$. Physically, this means the light wave is monochromatic and has
frequency $\nu = \omega/2\pi$; its electric vector is along $X$ and of amplitude $F_0$.
If $V$ did not contain $x$ and $\sin\omega t$ in product form, eq. (118) with
$H = H_0 + V$ would be separable; the fusion of $x$ and $t$ into $V$ spoils
separability.

In solving (118) we use the following initial condition: At $t = 0$, when
the atom was exposed to the perturbation $V$, the atom was certainly in an
eigenstate of the operator $H_0$, say in the state $\psi_1$ corresponding to the
energy $E_1$ which we shall take to be the lowest energy of the system. Or,
if we wish to include the trivial time dependence of the state, we take

$$u_1 = \psi_1 e^{-(iE_1/\hbar)t} \qquad (11\text{-}122)$$

The solution of

$$(H_0 + V)v = i\hbar\dot{v} \qquad (11\text{-}123)$$

which we desire, is certainly available in the form

$$v = \sum_\lambda c_\lambda\psi_\lambda e^{-(iE_\lambda/\hbar)t} \qquad (11\text{-}124)$$


26 Eq. (121) is a valid approximation for the purpose at hand. It neglects the
energy due to the magnetic vector of the light wave whose contribution is small
compared to (121) in the ratio $v/c$, where $v$ is the velocity of the charge composing the atom
and $c$ the velocity of light. For hydrogen, $v/c$ is 1/137. Furthermore, eq. (121) implies
that the wave length of the light is large compared with the size of the atom. Correctly,
$V = -eF_0x\sin(\omega t - 2\pi z/\lambda)$, and we are omitting the term $z/\lambda$. The legitimacy of this
will be clear from the following analysis.
provided we let the coefficients $c$ be functions of the time. This follows
immediately from the completeness of the $\psi_\lambda$ with respect to functions of
the space coordinates. When (124) is substituted into (123), there results

$$\sum_\lambda c_\lambda(H_0\psi_\lambda + V\psi_\lambda)e^{-(iE_\lambda/\hbar)t} = \sum_\lambda (E_\lambda c_\lambda + i\hbar\dot{c}_\lambda)\psi_\lambda e^{-(iE_\lambda/\hbar)t}$$

wherein each term $H_0\psi_\lambda$ on the left cancels $E_\lambda\psi_\lambda$ on the right. Let us now
multiply the remaining terms of the equation by $\psi_k^*$ and integrate over
configuration space, remembering the orthogonality of the $\psi_\lambda$. Then, after
simple rearrangement,

$$\dot{c}_k = -\frac{i}{\hbar}\sum_\lambda c_\lambda V_{k\lambda}\,e^{(i/\hbar)(E_k - E_\lambda)t}, \qquad k = 1, 2, 3, \cdots \qquad (11\text{-}125)$$

where, as usual,

$$V_{k\lambda} = \int \psi_k^* V\psi_\lambda\,d\tau$$

If the unperturbed atom has an infinite number of states, (125) represents
an infinite set of linear differential equations, which in general cannot
be solved. But we now recall that at $t = 0$, $v = u_1$, which means that
all $c_\lambda$ except $c_1$ were zero at that time. Thereafter $c_1$ decayed from 1 to
some smaller value, while all other $c$'s grew from 0 to various finite values.
We now limit our inquiry to times so small that $c_1$ is still sensibly unity, and
the other $c$'s are small compared with it, although $\dot{c}_1$ may be quite
comparable with the time derivatives of other $c$'s. This permits the
approximation of replacing every $c_\lambda$ on the right-hand side of (125) by its value at
$t = 0$, while retaining every $\dot{c}_\lambda$. The equation then becomes

$$\dot{c}_k = -\frac{i}{\hbar}V_{k1}\,e^{(i/\hbar)(E_k - E_1)t}$$

To simplify writing we introduce the abbreviation

$$\frac{E_k - E_1}{\hbar} = \omega_k$$

and observe that every $\omega_k > 0$, since, as we are assuming, $E_1$ is the lowest
energy state. In view of (121),

$$V_{k1} = -eF_0x_{k1}\sin\omega t = \tfrac{1}{2}ieF_0x_{k1}(e^{i\omega t} - e^{-i\omega t})$$

so that

$$\dot{c}_k = \frac{eF_0}{2\hbar}x_{k1}\left[e^{i(\omega_k + \omega)t} - e^{i(\omega_k - \omega)t}\right]$$
On integration,

$$c_k = \frac{ieF_0}{2\hbar}x_{k1}\left[\frac{e^{i(\omega_k - \omega)t} - 1}{\omega_k - \omega} - \frac{e^{i(\omega_k + \omega)t} - 1}{\omega_k + \omega}\right]$$

where we have at once adjusted the constant of integration so that $c_k = 0$
when $t = 0$. For physical reasons, only the first term in the square
parenthesis need be retained because it alone can attain appreciable
magnitude. (Both $\omega$ and $\omega_k > 0$.) In fact $c_k$ is large only when $\omega = \omega_k$, and
this fact is accentuated when $c_k$ is squared:

$$|c_k|^2 = \frac{e^2F_0^2}{4\hbar^2}|x_{k1}|^2\,\frac{2 - 2\cos(\omega_k - \omega)t}{(\omega_k - \omega)^2}
= \frac{e^2F_0^2}{4\hbar^2}|x_{k1}|^2\left[\frac{\sin\dfrac{(\omega_k - \omega)t}{2}}{\dfrac{\omega_k - \omega}{2}}\right]^2 \qquad (11\text{-}126)$$


We now interpret this result. The coefficient $c_k$ is, in view of (124),
the $k$-th probability amplitude in the expansion of the state function $v$ at
time $t$ in terms of energy eigenstates of the normal atom. Hence because
of sec. 5, $|c_k|^2$ is the probability that at time $t$ the $k$-th energy level of the
atom be excited; it is the "transition probability" from state 1 to state $k$
when the atom has been exposed to monochromatic light of frequency
$\omega/2\pi$ for $t$ seconds.

Many interesting conclusions of a physical nature can be drawn from
eq. (126), of which only two will here be mentioned. First, the transition
probability is proportional to the square of the matrix element connecting
the states in question. Whenever $x_{k1}$ vanishes, $|c_k|^2 = 0$. Hence the
vanishing of $x_{k1}$ is the criterion of a "forbidden" transition. In the second
place, the transition probability is small unless $\omega = \omega_k$, which is the Bohr
frequency condition.




Problem. The reader may be surprised to find that $|c_k|^2$ is not a linear function of $t$,
as might be expected on physical grounds. Show that, when the incident light forms a
continuous spectrum of uniform intensity, $|c_k|^2$ is proportional to $t$. (For this purpose,
(126) must be integrated over $\omega$ from 0 to $\infty$; but the integration may without
appreciable error be taken from $-\infty$ to $+\infty$.)
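
The claim made in this problem can be checked numerically before it is proved. The sketch below is only a plausibility check under stated assumptions: it keeps just the frequency-dependent factor $[\sin(\Delta t/2)/(\Delta/2)]^2$ of (126), with $\Delta = \omega_k - \omega$, integrates it over a wide but finite range of $\Delta$ by the trapezoidal rule, and uses hypothetical function names. The exact integral over all $\Delta$ is $2\pi t$.

```python
import math


def line_shape(delta, t):
    """[sin(delta*t/2) / (delta/2)]^2, the frequency factor of (126)."""
    if abs(delta) < 1e-12:
        return t * t  # limiting value at exact resonance
    return (math.sin(0.5 * delta * t) / (0.5 * delta)) ** 2


def integrated_line_shape(t, half_range=2000.0, steps=400000):
    """Trapezoidal integral of line_shape over -half_range..+half_range."""
    d = 2.0 * half_range / steps
    total = 0.5 * (line_shape(-half_range, t) + line_shape(half_range, t))
    for j in range(1, steps):
        total += line_shape(-half_range + j * d, t)
    return total * d


one = integrated_line_shape(1.0)
two = integrated_line_shape(2.0)
print(one, two, two / one)  # the ratio should be close to 2
```

At a single frequency the factor grows as $t^2$ at resonance, but its width shrinks as $1/t$, so the integrated probability grows only linearly with the time.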


ELECTRON SPIN. PAULI THEORY 


11.29. Fundamentals of the Theory.—The theory so far developed
describes the general behavior of atomic and molecular systems surprisingly
well, but it makes some false predictions, particularly with regard to the
finer details of the energy states of atoms, the Zeeman effect, and the
magnetic properties of electrons. It was soon apparent that the state of a single
electron could not be represented as a function of three space coordinates
alone, but that another parameter was required whose interpretation was
for some time in doubt. Most decisive in clarifying the situation was the
spectroscopic observation of the doubling of the energy levels of a single
electron: In all alkali atoms, for instance, two levels are found where the
Schrödinger equation permits only one. The energy difference between
these levels was such as would be produced by a small magnet of magnetic
moment $\hbar e/2mc$ setting itself once parallel and then opposite to the
magnetic field present in the atom on account of the electron's revolution.
Also, the angular momentum corresponding to these two energy
states was known to be different; it was equal to that caused by the
electron's orbital motion, plus $\hbar/2$ in one, minus $\hbar/2$ in the other state.

Uhlenbeck and Goudsmit suggested that the electron behaves like a
spinning top having a "spin" angular momentum of magnitude $\hbar/2$
which, however, can only add or subtract its whole amount, in quantum
fashion, to any angular momentum the electron already possesses as a
result of its orbital motion. Correspondingly, the electron generates by its
spin a magnetic moment of magnitude $\hbar e/2mc$ ($m$ is the electron mass, $c$ the
velocity of light), and this also communicates itself in toto, either parallel
or in opposition, to any magnetic moment already present.

To describe the electron spin as an angular momentum of the usual kind
and to associate with it an operator like $L$ (eq. 44) proved a fruitless
undertaking, chiefly because $L$ would have more than two eigenstates. The most
successful procedure of including the spin in the quantum mechanical
formalism, aside from Dirac's relativistic treatment of the electron, is that of
Pauli which will now be described. What follows will refer only to the
spin states of a single electron; some applications to several electrons may
be found in secs. 34 and 35.

Since the three space coordinates are insufficient to specify the complete
state of an electron, we introduce a fourth, the "spin coordinate," and
denote it by $s_z$. It corresponds, in classical language, to the cosine of the
angle between the axis of the spin angular momentum and the Z-axis of
coordinates. This visual interpretation, while in no way dictated by the
mathematical formalism, will be found a useful mental aid. Thus the
state function of an electron has the form

$$\phi(x,y,z,s_z)$$

Since in all that follows, the hypothetical spin coordinates $s_x$ and $s_y$ are
never needed, we shall henceforth delete the subscript $z$ on $s$, but retain the
above interpretation. Hence $\phi = \phi(x,y,z,s)$. Finally, it is well for the
moment to abstract attention entirely from the space dependent part of
the wave function, i.e., to consider $x$, $y$, $z$ as fixed, concentrating our inquiry
solely upon the electron spin. Then $\phi = \phi(s)$.
If $s$, like $x$, $y$ and $z$, were permitted to assume a continuous range of
values, difficulties would result. Pauli therefore postulates, in a manner
admittedly ad hoc and designed to force success of the theory, that the
range of $s$ consists of only two points: $s = \pm 1$ (classical meaning: spin
vector is parallel or in opposition to Z). A function of $s$ is therefore
defined only at these two points. The most general spin function is,
accordingly,

$$\phi(s) = a\,\delta_{s,+1} + b\,\delta_{s,-1} \qquad (11\text{-}127)$$

where the $\delta$'s are Kronecker symbols.

Our postulates involved certain integrals over configuration space.
But an integral over configuration space consisting of two points vanishes.
It becomes necessary to redefine the integral as a summation over the two
points:

$$\int F(s)\,ds = F(-1) + F(1)$$

If $\phi(s)$ is to be normalized,

$$\int \left(|a|^2\delta_{s,+1}^2 + |b|^2\delta_{s,-1}^2 + (a^*b + b^*a)\,\delta_{s,+1}\delta_{s,-1}\right)ds = |a|^2 + |b|^2 = 1 \qquad (11\text{-}128)$$


In a very trivial sense, eq. (127) represents an expansion of a function
$\phi(s)$ in a complete orthonormal set of functions, $\delta_{s,+1}$ and $\delta_{s,-1}$. To what
operator do these two functions belong as eigenstates? The answer is
suggested by intuition and will be justified by its complete success; it is
the operator $S_z$ which is associated with the observable: spin angular
momentum along Z. We must now give thought to the mathematical
structure of this operator.

Empirical evidence cited in the introductory paragraphs demands that
its two eigenvalues be $\pm\hbar/2$. Hence it must satisfy the two equations

$$S_z\delta_{s,+1} = \frac{\hbar}{2}\delta_{s,+1}, \qquad S_z\delta_{s,-1} = -\frac{\hbar}{2}\delta_{s,-1} \qquad (11\text{-}129)$$

It is possible to show that no differential operator of the type encountered
previously can satisfy these equations without giving rise to an infinite
number of other eigenstates. But why search for the operator? The simplest
point of view, and that here taken, is to regard eqs. (129) as a definition of
the operator $S_z$.²⁷ The result of applying $S_z$ to the most general function
of $s$ (eq. 127) can be constructed on the basis of (129); hence (129) exhausts
the meaning of $S_z$ and is its definition.

To simplify the notation, and to be in accord with custom, we now
introduce the symbol $\alpha(s)$ for $\delta_{s,+1}$, and $\beta(s)$ for $\delta_{s,-1}$. Furthermore, we
define a new operator

$$\sigma_z = \frac{2}{\hbar}S_z$$

which has eigenvalues $\pm 1$, simply to save writing. Then,
in view of (129),

$$\sigma_z\alpha(s) = \alpha(s), \qquad \sigma_z\beta(s) = -\beta(s) \qquad (11\text{-}130)$$

It is indeed possible and often useful to find an explicit operator in form
of a matrix which will satisfy these equations. This matrix is easily formed
by means of the principles outlined in sec. 17. Our eigenstates are $\psi_1 = \alpha$,
$\psi_2 = \beta$, and we construct $(\sigma_z)_{ij} = \int \psi_i^*\sigma_z\psi_j\,d\tau$ with the integral replaced
by a summation. We thus obtain the two-square matrix

$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (11\text{-}131)$$


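To let it operate on what was formerly the function $\phi(s)$ the latter has
to be regarded as a vector whose components are its expansion coefficients:
If the function $\phi$ is given by

$$\phi(s) = a\alpha + b\beta$$

$a$ and $b$ being numbers, then the vector $\phi(s)$ is

$$\phi = \begin{pmatrix} a \\ b \end{pmatrix}$$

Thus, in the matrix representation,

$$\sigma_z\phi = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} \qquad (11\text{-}132)$$

and the reader will easily verify by the rules of Chapter 10 that the two
eigenvectors of $\sigma_z$ are $\phi = \begin{pmatrix} a \\ 0 \end{pmatrix}$ and $\phi = \begin{pmatrix} 0 \\ b \end{pmatrix}$, where the values of both $a$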


27 An operator P is in general uniquely determined when the result of its action 
upon each member of an orthonormal set of functions is known. This method of defin- 
ing an operator is ordinarily not useful because an infinite number of relations like (129) 
would be required. 
and $b$ must be unity because of (128). The eigenvalues are, respectively,
$+1$ and $-1$. But the functions $\phi$ corresponding to the vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$
are clearly $\alpha$ and $\beta$, which takes us back to the scheme (130).

It is seen that there is a complete isomorphism between the two descriptions
of the operator $S_z$ and its eigenstates $\phi$: One in terms of matrices
and eigenvectors, where the rule of operations is (132); the other in terms
of linear substitution operators and eigenfunctions, where the rule of
operations is (130).

The question now arises as to the structure of the operators $S_x$ and $S_y$,
associated with the other two components of the spin.²⁸ In endeavoring to
construct them it is important to recall one significant fact concerning the
ordinary angular momentum $L$: its components do not commute with one
another. In fact (see eq. 7)

$$L_xL_y - L_yL_x = i\hbar L_z, \qquad L_yL_z - L_zL_y = i\hbar L_x, \qquad L_zL_x - L_xL_z = i\hbar L_y$$

Let us assume that the components of the spin $S$, this being an angular
momentum operator, must be subject to the same commutation rules. In
terms of $\sigma$ rather than $S$, we postulate

$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z; \qquad \sigma_y\sigma_z - \sigma_z\sigma_y = 2i\sigma_x; \qquad \sigma_z\sigma_x - \sigma_x\sigma_z = 2i\sigma_y \qquad (11\text{-}133)$$

These relations imply that an eigenstate of $S_z$, e.g., $\alpha(s)$ or $\beta(s)$, cannot be a
simultaneous eigenstate of $S_x$ or $S_y$ (sec. 7).

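The construction of $\sigma_x$ and $\sigma_y$, $\sigma_z$ being given, is more easily performed
in the matrix scheme. If we set ourselves the problem of determining two
matrices $\sigma_x$ and $\sigma_y$, which, when combined with $\sigma_z$ of eq. (131), obey (133),
we easily find that the answer is not unique. But certainly the solution

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad (11\text{-}134)$$

is a possible one. The ambiguity here encountered permits just enough
freedom to make possible a rotation of coordinate axes (see Chap. 15).
Let us, then, accept (134) as our solution in matrix form. Clearly, $\sigma_x$
has eigenvalues $\pm 1$, eigenvectors $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$; $\sigma_y$ has
eigenvalues $\pm 1$, eigenvectors $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}$ and $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix}$. The observable values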

28 While we need only one spin coordinate, $s$, all three components of the operator
must be introduced because they appear in the Hamiltonian and other operators.
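of all three components $S_x$, $S_y$ and $S_z$ are therefore $\pm\hbar/2$. When these
results are translated into the function language they read as follows.
The equation $\sigma_x\phi(s) = \lambda\phi(s)$ has two possible (normalized) solutions:

$$\lambda = 1,\ \ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) + \beta(s)]; \qquad \lambda = -1,\ \ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) - \beta(s)] \qquad \text{(a)}$$

The equation $\sigma_y\phi(s) = \lambda\phi(s)$ has two possible solutions:

$$\lambda = 1,\ \ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) + i\beta(s)]; \qquad \lambda = -1,\ \ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) - i\beta(s)] \qquad \text{(b)} \quad (11\text{-}135)$$

The equation $\sigma_z\phi(s) = \lambda\phi(s)$ has two possible solutions:

$$\lambda = 1,\ \ \phi(s) = \alpha(s); \qquad \lambda = -1,\ \ \phi(s) = \beta(s) \qquad \text{(c)}$$

If now we write the eqs. (135a) in the simpler form

$$\sigma_x\alpha + \sigma_x\beta = \alpha + \beta, \qquad \sigma_x\alpha - \sigma_x\beta = -(\alpha - \beta)$$

and solve these by adding and subtracting, we find

$$\sigma_x\alpha = \beta, \qquad \sigma_x\beta = \alpha$$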


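The same procedure applied to (135b) and (135c) yields similar relations.
Summarizing these results: The operators $\sigma_x$, $\sigma_y$, $\sigma_z$ may be represented
either by the set of linear substitutions

$$\begin{aligned}
\sigma_x\alpha &= \beta, & \sigma_y\alpha &= i\beta, & \sigma_z\alpha &= \alpha, \\
\sigma_x\beta &= \alpha; & \sigma_y\beta &= -i\alpha; & \sigma_z\beta &= -\beta
\end{aligned} \qquad (11\text{-}136)$$

or by the matrices

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (11\text{-}137)$$

For practical use, the set of substitutions is to be preferred.
Note that the operators

$$\sigma^+ = \tfrac{1}{2}(\sigma_x + i\sigma_y) \quad \text{and} \quad \sigma^- = \tfrac{1}{2}(\sigma_x - i\sigma_y)$$

satisfy the convenient relations

$$\sigma^+\alpha = 0, \quad \sigma^-\alpha = \beta, \qquad \sigma^+\beta = \alpha, \quad \sigma^-\beta = 0$$

They are sometimes called "displacement operators."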
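
The algebra of the spin matrices is small enough to be verified by direct multiplication. The sketch below is an illustration only: it uses the standard matrices $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, represents $\alpha$ and $\beta$ by the vectors (1, 0) and (0, 1), and relies on small hypothetical helpers (`matmul`, `commutator`, `scale`, `apply`) written for the purpose.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Multiply every element of the matrix a by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def apply(a, v):
    """Matrix a acting on the two-component vector v."""
    return [sum(a[i][j] * v[j] for j in range(2)) for i in range(2)]


SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
ALPHA, BETA = [1, 0], [0, 1]

# Displacement operators (sigma_x +/- i sigma_y) / 2.
SIGMA_PLUS = [[0.5 * (SIGMA_X[i][j] + 1j * SIGMA_Y[i][j]) for j in range(2)]
              for i in range(2)]
SIGMA_MINUS = [[0.5 * (SIGMA_X[i][j] - 1j * SIGMA_Y[i][j]) for j in range(2)]
               for i in range(2)]

print(commutator(SIGMA_X, SIGMA_Y) == scale(2j, SIGMA_Z))  # commutation rule
print(apply(SIGMA_Y, ALPHA), apply(SIGMA_Y, BETA))         # i*beta, -i*alpha
print(apply(SIGMA_PLUS, BETA), apply(SIGMA_MINUS, ALPHA))  # alpha, beta
```

The same helpers confirm the two cyclic partners of the commutation rule and the fact that $\sigma^+$ annihilates $\alpha$ while $\sigma^-$ annihilates $\beta$.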
We return to the consideration of the general state function of an
electron, which includes $x$, $y$, $z$ and $s$ as arguments. Such a function may
certainly be expanded in eigenfunctions of $\sigma_z$, i.e.,

$$\phi(x,y,z,s) = \phi_+(x,y,z)\,\alpha(s) + \phi_-(x,y,z)\,\beta(s)$$

Normalization now requires

$$\int \phi^*\phi\,d\tau = \sum_s\int \phi^*\phi\,dx\,dy\,dz = \int (\phi_+^*\phi_+ + \phi_-^*\phi_-)\,dx\,dy\,dz = 1$$

The operators $\sigma_x$, $\sigma_y$, $\sigma_z$ do not act on $\phi_+$ and $\phi_-$ which are only functions of
$x$, $y$, $z$; in other words, they commute with space coordinates. Thus, for
instance,

$$\sigma_y\phi(x,y,z,s) = \sigma_y\phi_+\alpha + \sigma_y\phi_-\beta = \phi_+\sigma_y\alpha + \phi_-\sigma_y\beta = i\phi_+\beta - i\phi_-\alpha$$

In the matrix scheme, $\phi(x,y,z,s)$ is represented by the vector

$$\phi = \begin{pmatrix} \phi_+(x,y,z) \\ \phi_-(x,y,z) \end{pmatrix}$$

In the sense of this analysis it may be said that the introduction of the spin
in the Pauli manner causes all Schrödinger functions to become
two-component functions.


Problem. Carry out the algebra involved in finding the two Hermitian matrices
(134).

11.30. Applications.—a. Atom in a Magnetic Field. Our interest here
is not in a complete solution of this problem, which may be found worked
out in most books on quantum mechanics, but in its salient mathematical
features. We wish to find the energies of a one-electron atom (e.g., hydrogen
or, with good approximation, the alkalis) when it is placed in a uniform
magnetic field. The Hamiltonian consists of two parts, one acting on the
electron's space coordinates and one acting on the spin coordinate. The
former will be called $H_0$; the latter is the "spin energy." If the magnetic
field $\mathcal{H}$ is taken along the Z-axis, the classical energy of a particle of
magnetic moment $\boldsymbol{\mu}$ would be $\boldsymbol{\mu}\cdot\boldsymbol{\mathcal{H}} = \mu_z\mathcal{H}_z$. But empirically, the magnetic
moment associated with the spin is $(\hbar e/2mc)\boldsymbol{\sigma}$. We shall here write $\mu$ for
the constant $\hbar e/2mc$. In quantum mechanical transcription, then, the
"spin energy" is $\mu\mathcal{H}_z\sigma_z$ where $\sigma_z$ is interpreted as the operator (130) or
(131). The Schrödinger equation becomes

$$(H_0 + \mu\mathcal{H}_z\sigma_z)\Psi = E\Psi \qquad (11\text{-}138)$$

Let

$$\Psi(x,y,z,s) = \psi_+(x,y,z)\,\alpha(s) + \psi_-(x,y,z)\,\beta(s)$$

and substitute, obtaining

$$\alpha(s)[H_0 + \mu\mathcal{H}_z - E]\psi_+ + \beta(s)[H_0 - \mu\mathcal{H}_z - E]\psi_- = 0$$
provided relations (136) are used. Since $\alpha$ and $\beta$ are linearly independent,
orthogonal functions of $s$, their coefficients in the last equation must
separately vanish.²⁹ Hence we have

$$H_0\psi_+ = (E - \mu\mathcal{H}_z)\psi_+, \qquad H_0\psi_- = (E + \mu\mathcal{H}_z)\psi_- \qquad (11\text{-}139)$$

Now let $E_0$ be an eigenvalue of $H_0$, $\psi_0$ the corresponding eigenfunction.
The first of eqs. (139) (which is nothing more than an eigenvalue equation
for the operator $H_0$) then says $E - \mu\mathcal{H}_z = E_0$ or $E = E_0 + \mu\mathcal{H}_z$, $\psi_+ = \psi_0$.
On substituting this value of $E$ into the second equation it reads $H_0\psi_- =
(E_0 + 2\mu\mathcal{H}_z)\psi_-$, and this can only be satisfied by putting $\psi_- = 0$ because
$E_0 + 2\mu\mathcal{H}_z$ is not an eigenvalue of $H_0$. Thus we obtain as one solution
of (138)

$$E = E_0 + \mu\mathcal{H}_z, \qquad \Psi = \psi_0(x,y,z)\,\alpha(s) \qquad (11\text{-}140a)$$

But we can also start with the second of eqs. (139) and assume $\psi_-$ to be
$\psi_0$, $E + \mu\mathcal{H}_z$ to be $E_0$. Then $\psi_+ = 0$ and we have

$$E = E_0 - \mu\mathcal{H}_z, \qquad \Psi = \psi_0(x,y,z)\,\beta(s) \qquad (11\text{-}140b)$$


How does the inclusion of the spin modify the eigenvalues and
eigenfunctions of the Schrödinger equation when there is no magnetic field?
The answer is obtained by letting $\mathcal{H}_z$ vanish in (140a, b). Both values of $E$
coalesce to $E_0$ which now represents the ordinary Schrödinger energy in
the absence of a field, but the functions $\Psi$ remain distinct. The spin thus
introduces a degeneracy into the Schrödinger representation of states.
Formulas (140) account, in a primitive way, for the doubling of the
alkali energy levels, the field $\mathcal{H}_z$ being caused in that case by the electron's
orbital motion, and not by external agencies.


Problem. Solve eq. (138) by the method of separation of variables, i.e., by putting 
Y = ¢(x,y,2)6(s), and show that (140) is the solution obtained by that method also. 


b. A Spin Problem. Having shown how spin and coordinate functions
cooperate in the description of the state of an electron, let us omit further
reference to space coordinates and inquire what are the energies which an
electron, placed in a uniform magnetic field of arbitrary direction, may
assume regardless of its translational motion. The only energy of interest
is that due to the spin. Let $\mathcal{H}$ be the magnetic field strength. The
Schrödinger equation reads

$$H\psi = \mu(\mathcal{H}_x\sigma_x + \mathcal{H}_y\sigma_y + \mathcal{H}_z\sigma_z)\psi(s) = E\psi(s) \qquad (11\text{-}141)$$

If $\mathcal{H}$ is taken along Z, the equation reduces to

$$\mu\mathcal{H}\sigma_z\psi(s) = E\psi(s) \qquad (11\text{-}142)$$


29 This can be seen explicitly if the equation is multiplied by either $\alpha(s)$ or $\beta(s)$ and
then "integrated" over $s$.
The operator on the left is but a constant multiple of $\sigma_z$ and must therefore
have the same eigenfunctions as $\sigma_z$, i.e., $\alpha$ and $\beta$. The corresponding
eigenvalues are at once seen to be $E = \pm\mu\mathcal{H}$. We shall show that eq. (141) has
the same eigenvalues, but different eigenfunctions.

Make the substitution $\psi = a\alpha(s) + b\beta(s)$ in eq. (141). On using,
subsequently, relations (136) the result will be

$$\mu\{\mathcal{H}_x(a\beta + b\alpha) + i\mathcal{H}_y(a\beta - b\alpha) + \mathcal{H}_z(a\alpha - b\beta)\} - E(a\alpha + b\beta) = 0$$

As before, the coefficients of $\alpha$ and $\beta$ may be put equal to zero separately,
so that

$$\begin{aligned}
\mu(\mathcal{H}_x a + i\mathcal{H}_y a - \mathcal{H}_z b) &= Eb \\
\mu(\mathcal{H}_x b - i\mathcal{H}_y b + \mathcal{H}_z a) &= Ea
\end{aligned} \qquad (11\text{-}143)$$

If the equations are to have solutions $a$, $b$, which are different from zero, the
determinant of the coefficients of $a$, $b$ must vanish, whence $E = \pm\mu\mathcal{H}$.
On substituting $E = +\mu\mathcal{H}$ into the first of eqs. (143) and then taking the
square of its absolute value, we have

$$(\mathcal{H}_x^2 + \mathcal{H}_y^2)\,|a|^2 = (\mathcal{H} + \mathcal{H}_z)^2\,|b|^2$$

Let us call the angle between $\mathcal{H}$ and the Z-axis $\theta$, so that $\mathcal{H}_x^2 + \mathcal{H}_y^2 =
\mathcal{H}^2\sin^2\theta$, and $\mathcal{H}_z = \mathcal{H}\cos\theta$. Furthermore, in view of (128), $|b|^2 =
1 - |a|^2$. When these substitutions are made and the last equation is
solved, the squares of the absolute values of $a$, $b$ are found to be $\cos^2\theta/2$
and $\sin^2\theta/2$, respectively. Let us then put $a = \cos\theta/2$, $b = e^{i\delta}\sin\theta/2$,
treating $\delta$ as a phase constant. With the further substitutions $\mathcal{H}_x =
\mathcal{H}\sin\theta\cos\phi$, $\mathcal{H}_y = \mathcal{H}\sin\theta\sin\phi$, where $\phi$ is the azimuth of the field, we
find from (143) that $\delta = \phi$.

In a similar way, when $E = -\mu\mathcal{H}$, we may put $a = \sin\theta/2$, $b = e^{i\delta}\cos\theta/2$
and find $\delta = \pi + \phi$, so that $b = -e^{i\phi}\cos\theta/2$.

We conclude that eq. (141) has the eigenvalues $E_1 = \mu\mathcal{H}$, $E_2 = -\mu\mathcal{H}$,
and the corresponding eigenfunctions

$$\begin{aligned}
\psi_1 &= \cos\frac{\theta}{2}\,\alpha(s) + \sin\frac{\theta}{2}\,e^{i\phi}\beta(s) \\
\psi_2 &= \sin\frac{\theta}{2}\,\alpha(s) - \cos\frac{\theta}{2}\,e^{i\phi}\beta(s)
\end{aligned} \qquad (11\text{-}144)$$

Notice that, when the field $\mathcal{H}$ is reversed in direction (i.e., $\theta \to \pi - \theta$,
$\phi \to \phi + \pi$), $\psi_1$ and $\psi_2$ exchange their roles.
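
The eigenfunctions for a field of arbitrary direction can be tested by letting the matrix form of the spin energy act on them. The sketch below is an illustration under stated assumptions: it uses the standard matrices $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, represents $\alpha$ and $\beta$ by the vectors (1, 0) and (0, 1), sets $\mu = 1$ by default, and uses hypothetical function names. With this form of $\sigma_y$ the phase factor attached to $\beta$ comes out as $e^{+i\phi}$, where $\theta$ and $\phi$ are the polar angle and azimuth of the field.

```python
import cmath
import math


def field_hamiltonian(hx, hy, hz, mu=1.0):
    """Matrix of mu * (Hx sigma_x + Hy sigma_y + Hz sigma_z)."""
    return [[mu * hz, mu * (hx - 1j * hy)],
            [mu * (hx + 1j * hy), -mu * hz]]


def eigenstates(theta, phi):
    """Trial eigenvectors (a, b) for the energies +mu*H and -mu*H."""
    psi1 = [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]
    psi2 = [math.sin(theta / 2), -cmath.exp(1j * phi) * math.cos(theta / 2)]
    return psi1, psi2


def residual(matrix, vector, eigenvalue):
    """Largest component of (matrix @ vector - eigenvalue * vector)."""
    return max(
        abs(sum(matrix[i][j] * vector[j] for j in range(2)) - eigenvalue * vector[i])
        for i in range(2)
    )


strength, theta, phi = 1.7, 0.8, 2.1  # arbitrary illustrative field
h = field_hamiltonian(strength * math.sin(theta) * math.cos(phi),
                      strength * math.sin(theta) * math.sin(phi),
                      strength * math.cos(theta))
psi1, psi2 = eigenstates(theta, phi)
print(residual(h, psi1, strength), residual(h, psi2, -strength))
```

Both residuals vanish to rounding accuracy, and replacing $(\theta, \phi)$ by $(\pi - \theta, \phi + \pi)$ in `eigenstates` interchanges the two vectors, in line with a reversal of the field.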
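Problem. Solve eq. (141) by diagonalizing the matrix

$$\mu(\mathcal{H}_x\sigma_x + \mathcal{H}_y\sigma_y + \mathcal{H}_z\sigma_z) = \mu\begin{pmatrix} \mathcal{H}_z & \mathcal{H}_x - i\mathcal{H}_y \\ \mathcal{H}_x + i\mathcal{H}_y & -\mathcal{H}_z \end{pmatrix}$$

and show that it leads to the same results.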


THE MANY-BODY PROBLEM AND THE EXCLUSION PRINCIPLE 


11.31. Separation of the Coordinates of the Center of Mass.—In 
classical mechanics, a system containing many particles and subject only 
to internal forces behaves in such a way that its center of mass moves uni- 
formly on a straight line. As a corollary of this theorem every classical 
two-body problem may be reduced to a one-body problem.³⁰ A similar
fact may be proved in quantum theory.

The Schrödinger equation for a system of n particles of masses
$m_1, \cdots, m_n$ reads:

$$\left(-\sum_{i=1}^{n}\frac{\hbar^2}{2m_i}\nabla_i^2 + V\right)\psi = E\psi \tag{11-145}$$

where $\nabla_i^2 = \partial^2/\partial x_i^2 + \partial^2/\partial y_i^2 + \partial^2/\partial z_i^2$.
The potential energy, V, is to be regarded as a function of the relative coordinates
$x_j - x_i$, $y_j - y_i$, $z_j - z_i$.

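30 So long as relativity effects are neglected.

We first transform to a new set of coordinates, defined as follows:

$$X = \frac{1}{M}\sum_{i=1}^{n} m_i x_i, \qquad M = \sum_{i=1}^{n} m_i$$
$$x_2' = x_2 - X, \quad x_3' = x_3 - X, \quad \cdots, \quad x_n' = x_n - X \tag{11-146}$$

with similar relations for the y and z components. Note that $x_1'$ is missing;
the coordinates of one particle have been eliminated by the introduction of
the center of mass coordinates X, Y, Z. In computing the sum of the
Laplacian operators occurring in (145) in terms of the new coordinates we
observe:

$$\frac{\partial X}{\partial x_i} = \frac{\partial Y}{\partial y_i} = \frac{\partial Z}{\partial z_i} = \frac{m_i}{M}, \qquad
\frac{\partial x_j'}{\partial x_i} = \frac{\partial y_j'}{\partial y_i} = \frac{\partial z_j'}{\partial z_i} = \delta_{ij} - \frac{m_i}{M}$$

$$\frac{\partial^2\psi}{\partial x_1^2} = \frac{m_1^2}{M^2}\left(\frac{\partial^2\psi}{\partial X^2} - 2\sum_{j=2}^{n}\frac{\partial^2\psi}{\partial X\,\partial x_j'} + \sum_{j,k=2}^{n}\frac{\partial^2\psi}{\partial x_j'\,\partial x_k'}\right)$$

$$\frac{\partial^2\psi}{\partial x_i^2} = \frac{m_i^2}{M^2}\left(\frac{\partial^2\psi}{\partial X^2} - 2\sum_{j=2}^{n}\frac{\partial^2\psi}{\partial X\,\partial x_j'} + \sum_{j,k=2}^{n}\frac{\partial^2\psi}{\partial x_j'\,\partial x_k'}\right)
+ \frac{2m_i}{M}\left(\frac{\partial^2\psi}{\partial X\,\partial x_i'} - \sum_{j=2}^{n}\frac{\partial^2\psi}{\partial x_i'\,\partial x_j'}\right) + \frac{\partial^2\psi}{\partial x_i'^2}, \qquad i \ge 2$$

and similar expressions for the derivatives with respect to y and z. When
these are combined we obtain, in place of (145), the equation

$$\left[-\frac{\hbar^2}{2M}\nabla^2 - \sum_{i=2}^{n}\frac{\hbar^2}{2m_i}\nabla_i'^2 + \frac{\hbar^2}{2M}\sum_{i,j=2}^{n}\left(\frac{\partial^2}{\partial x_i'\,\partial x_j'} + \frac{\partial^2}{\partial y_i'\,\partial y_j'} + \frac{\partial^2}{\partial z_i'\,\partial z_j'}\right) + V\right]\psi = E\psi \tag{11-147}$$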


Here $\nabla^2$ is the Laplacian with respect to the center of mass coordinates,
$\nabla_i'^2$ with respect to the primed coordinates. While V is not directly a
function of the primed coordinates, it may be expressed in terms of them
because $x_j' - x_i' = x_j - x_i$. A difficulty might seem to appear in connection
with $x_i - x_1$ because $x_1'$ is absent from the primed set. But it is easily
seen that $m_1(x_1 - X) = -\sum_{i=2}^{n} m_i x_i'$, whence
$x_i - x_1 = x_i' + \frac{1}{m_1}\sum_{j=2}^{n} m_j x_j'$. Therefore V, when expressed
in terms of the new coordinates, will not contain X, Y, or Z.
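The statement that the relative coordinates $x_i - x_1$ can be recovered from the primed set alone is easy to confirm with numbers. The following sketch uses hypothetical masses and positions (any values would do) and implements the relation $x_i - x_1 = x_i' + \frac{1}{m_1}\sum_j m_j x_j'$ quoted above; the function names are illustrative.

```python
def to_primed(masses, xs):
    """Return X and the primed coordinates x_2', ..., x_n' of eq. (11-146)."""
    M = sum(masses)
    X = sum(m * x for m, x in zip(masses, xs)) / M
    return X, [x - X for x in xs[1:]]

def relative_to_first(masses, primed):
    """Recover x_i - x_1 (i = 2..n) from the primed coordinates only."""
    m1 = masses[0]
    weighted = sum(m * xp for m, xp in zip(masses[1:], primed))
    return [xp + weighted / m1 for xp in primed]

# Hypothetical three-particle example.
masses = [1.0, 2.0, 3.0]
xs = [0.3, -1.2, 2.5]
X, primed = to_primed(masses, xs)
print(relative_to_first(masses, primed))   # reconstructed from primed coordinates
print([x - xs[0] for x in xs[1:]])         # direct differences, for comparison
```

Since X never enters `relative_to_first`, a potential that depends only on coordinate differences is independent of the center of mass coordinates, which is what makes (147) separable.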
As a result, eq. (147) is separable; therefore $\psi$ may be written as

$$\psi = U(X,Y,Z)\cdot\phi(x_2', \cdots, z_n')$$

Correspondingly, $E = E_t + E'$, where $E_t$ is the energy associated with
U(X,Y,Z), determined by

$$-\frac{\hbar^2}{2M}\nabla^2 U = E_t U$$

This is the Schrödinger equation of a free particle of mass M; it produces,
as we know, no quantization. The remainder of (147) describes
the internal motion of the particles:

$$\left(-\sum_{i=2}^{n}\frac{\hbar^2}{2m_i}\nabla_i'^2 + \frac{\hbar^2}{2M}\sum_{i,j=2}^{n}\nabla_i'\cdot\nabla_j' + V\right)\phi = E'\phi \tag{11-148}$$

It differs from the normal form of Schrödinger's equation by the presence
of the terms in $\nabla_i'\cdot\nabla_j'$ and by the fact that V has a different functional form
in the primed coordinates than in the unprimed ones.

The coordinates (146) measure the position of the i-th particle relative
to the center of mass. It is also possible to use a less symmetrical but
physically more useful set of coordinates, which is closely related to (146).
If we put

$$X = \frac{1}{M}\sum_{i=1}^{n} m_i x_i, \qquad M = \sum_{i=1}^{n} m_i$$
$$x_2' = x_2 - x_1, \quad x_3' = x_3 - x_1, \quad \cdots, \quad x_n' = x_n - x_1 \tag{11-149}$$

thus measuring all coordinates relative to that one which has been eliminated
($x_1$), we obtain in the same manner the equation

$$\left[-\frac{\hbar^2}{2M}\nabla^2 - \sum_{i=2}^{n}\frac{\hbar^2}{2m_i}\nabla_i'^2 - \frac{\hbar^2}{2m_1}\sum_{i,j=2}^{n}\nabla_i'\cdot\nabla_j' + V\right]\psi = E\psi \tag{11-150}$$

This form is particularly useful when it is desired to calculate the energy of
a many-electron atom, for particle 1 may then be taken to be the nucleus and
the summations in (150) are extended only over the electrons. The equation
remaining after separation of the motion of the center of mass is now

$$\left[-\frac{\hbar^2}{2}\left(\frac{1}{m}\sum_i \nabla_i'^2 + \frac{1}{m_1}\sum_{i,j}\nabla_i'\cdot\nabla_j'\right) + V\right]\phi = E'\phi$$

where m is the mass of an electron, $m_1$ that of the nucleus. It may be
written in terms of the reduced mass

$$\mu = \frac{m\,m_1}{m + m_1}$$

as follows:

$$\left[-\frac{\hbar^2}{2\mu}\sum_i \nabla_i'^2 - \frac{\hbar^2}{2m_1}\sum_{i \ne j}\nabla_i'\cdot\nabla_j' + V\right]\phi = E'\phi \tag{11-151}$$

The terms in the double summation play an important role in the isotope
effect of heavy atoms.³¹ They are present whenever the number of electrons
is greater than one. For the case of hydrogen, eq. (151) has the same
form as Schrödinger's equation for a stationary nucleus, except for the
replacement of the electron mass by $\mu$. Hence the true energies of the
hydrogen atom are not exactly given by eq. (64), but by that equation with
$\mu$ written for m.

31 See Hughes, A. L., and Eckart, C., Phys. Rev. 36, 694 (1930).

Note that the function V is different in (148) and (151), and that the
terms of the double summation have opposite signs. Nevertheless the
equivalence of these two equations for the two-body problem may be seen
as follows. Write for the potential energy in (151)

$$V = V(x', y', z'), \quad \text{where } x' = x_2 - x_1, \text{ etc.}$$

The V-function of (148) must then be expressed in terms of $x_2 - X$, $y_2 - Y$,
$z_2 - Z$. Now $x_2 - x_1 = (M/m_1)(x_2 - X)$. Therefore we must use
in (148)

$$V = V\left(\frac{m_1 + m_2}{m_1}x',\ \frac{m_1 + m_2}{m_1}y',\ \frac{m_1 + m_2}{m_1}z'\right)$$

and the equation reads

$$\left[-\frac{\hbar^2}{2}\left(\frac{1}{m_2} - \frac{1}{M}\right)\nabla'^2 + V(ax', ay', az')\right]\phi(x',y',z') = E'\phi(x',y',z')$$

where $a = (m_1 + m_2)/m_1$. If here we put $ax' = x''$, $ay' = y''$, $az' = z''$,
it becomes

$$\left[-\frac{\hbar^2}{2}\left(\frac{1}{m_2} - \frac{1}{M}\right)a^2\,\nabla''^2 + V(x'', y'', z'')\right]\phi = E'\phi$$

which, since $(1/m_2 - 1/M)\,a^2 = (m_1 + m_2)/m_1 m_2 = 1/\mu$, is identical with eq. (151).
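Two numerical consequences of the reduced-mass formulation can be sketched briefly. The first function below checks the identity $(1/m_2 - 1/M)a^2 = 1/\mu$ used to pass from (148) to (151) for hypothetical masses; the second evaluates the factor $\mu/m$ by which the fixed-nucleus hydrogen energies are multiplied, on the assumption that eq. (64) is proportional to the electron mass. The proton-to-electron mass ratio is a measured value supplied here, not a number taken from the text.

```python
def reduced_mass(m, m1):
    """mu = m*m1/(m + m1)."""
    return m * m1 / (m + m1)

def two_body_kinetic_coefficient(m1, m2):
    """(1/m2 - 1/M) * a**2 with M = m1 + m2 and a = M/m1, from the rescaled form of (148)."""
    M = m1 + m2
    a = M / m1
    return (1.0 / m2 - 1.0 / M) * a * a

PROTON_TO_ELECTRON = 1836.15267  # m_p/m_e, measured value (assumption: not from the text)

def hydrogen_energy_factor(mass_ratio=PROTON_TO_ELECTRON):
    """mu/m for an electron of mass m bound to a nucleus of mass mass_ratio*m."""
    return reduced_mass(1.0, mass_ratio)

# Hypothetical masses m1 = 3, m2 = 5: both numbers below should agree.
print(two_body_kinetic_coefficient(3.0, 5.0), 1.0 / reduced_mass(5.0, 3.0))
print(hydrogen_energy_factor())  # slightly below 1
```

The energy factor differs from unity by roughly five parts in ten thousand, which is the size of the correction to eq. (64) implied by the replacement of m by $\mu$.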

11.32. Independent Systems.—Physical systems are independent, or 
isolated from one another, if the Hamiltonian operator of one contains no 
terms referring to another system. There is then no interaction between 
them. Consider n independent systems, and let the coordinates of the
r-th system (including the spin coordinate) be symbolized by the single
letter $q_r$. If its Hamiltonian operator is $H_r$, its Schrödinger equation will be

$$H_r\,\psi_i^{(r)}(q_r) = E_i^{(r)}\,\psi_i^{(r)}(q_r) \tag{11-152}$$

$E_i^{(r)}$ being the i-th eigenvalue of the r-th system.
The state function describing the entire assemblage of n systems will
satisfy the equation

$$(H_1 + H_2 + \cdots + H_n)\,\Psi(q_1, q_2, \cdots, q_n) = E\,\Psi(q_1, q_2, \cdots, q_n) \tag{11-153}$$

To find its solutions we put
$\Psi(q_1, q_2, \cdots, q_n) = \psi^{(1)}(q_1)\,\psi^{(2)}(q_2)\cdots\psi^{(n)}(q_n)$
tentatively. Substitution in (153) and use of the fact that $H_1$ acts only on
$q_1$, etc., leads at once to the equation

$$\frac{H_1\psi^{(1)}}{\psi^{(1)}} + \frac{H_2\psi^{(2)}}{\psi^{(2)}} + \cdots + \frac{H_n\psi^{(n)}}{\psi^{(n)}} = E$$

which shows that each term $H_r\psi^{(r)}/\psi^{(r)}$ is separately a constant, say $E^{(r)}$,
and that the sum of all these constants is E. But if $H_r\psi^{(r)}/\psi^{(r)} = E^{(r)}$,
then $\psi^{(r)}$ must be one of the set of functions defined by (152), and $E^{(r)}$
one of the energies $E_i^{(r)}$. Therefore

$$\Psi(q_1, q_2, \cdots, q_n) = \psi_i^{(1)}(q_1)\,\psi_j^{(2)}(q_2)\cdots\psi_s^{(n)}(q_n)$$
$$E = E_i^{(1)} + E_j^{(2)} + \cdots + E_s^{(n)} \tag{11-154}$$

This result is indeed what intuition would lead us to expect. For
clearly the total energy of a number of isolated systems is the sum of the
individual energies. Furthermore, if $w_1$ is the probability that system 1 be
found at $q_1$, $w_2$ that system 2 be found at $q_2$, then the probability that both
of these statements be true simultaneously is the product $w_1 w_2$. Hence
the individual $\psi$-functions, whose squares are these probabilities, must
likewise combine as factors.

This latter circumstance is dictated also by the time dependence of the
Schrödinger states, eq. (113). For only the product of the individual
functions $\psi^{(1)}e^{-(i/\hbar)E^{(1)}t}$, $\psi^{(2)}e^{-(i/\hbar)E^{(2)}t}$, etc., will have the
factor $e^{-(i/\hbar)(E^{(1)} + E^{(2)} + \cdots)t}$ required in
$\Psi(q_1, q_2, \cdots, q_n)\,e^{-(i/\hbar)Et}$.
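The content of (11-154) can be illustrated with finite matrices standing in for the Hamiltonians. In the sketch below, two hypothetical 2×2 symmetric matrices play the roles of $H_1$ and $H_2$; the combined operator is $H_1\otimes 1 + 1\otimes H_2$, and each product of eigenvectors is verified to be an eigenvector belonging to the sum of the individual eigenvalues. The matrices and helper names are illustrative assumptions.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def kron_vec(v, w):
    """Product state: components v_i * w_k in the same ordering as kron()."""
    return [a * b for a in v for b in w]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

I2 = [[1, 0], [0, 1]]
H1 = [[2, 1], [1, 2]]                      # hypothetical system 1
H2 = [[0, 1], [1, 0]]                      # hypothetical system 2
PAIRS_1 = [(3, [1, 1]), (1, [1, -1])]      # (eigenvalue, eigenvector) of H1
PAIRS_2 = [(1, [1, 1]), (-1, [1, -1])]     # (eigenvalue, eigenvector) of H2

def check_product_states():
    """Return (E, ok) for every product state; ok is True when H_total Psi = E Psi."""
    total = add(kron(H1, I2), kron(I2, H2))
    results = []
    for E1, v in PAIRS_1:
        for E2, w in PAIRS_2:
            Psi = kron_vec(v, w)
            E = E1 + E2
            results.append((E, matvec(total, Psi) == [E * c for c in Psi]))
    return results

print(check_product_states())
```

Integer entries are used so that the comparison is exact; with floating-point data a tolerance would be needed instead of `==`.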

11.33. The Exclusion Principle.—When two independent systems 
occupy the energy states $E_i^{(1)}$ and $E_j^{(2)}$ respectively, the combined system
has an energy

$$E = E_i^{(1)} + E_j^{(2)}$$

and a state function

$$\Psi = \psi_i^{(1)}(q_1)\,\psi_j^{(2)}(q_2) \tag{11-155}$$

We shall suppose for the moment that the individual states $\psi_i^{(1)}$ and $\psi_j^{(2)}$
are non-degenerate. Then, unless there happen to be two energies $E_k^{(1)}$
and $E_l^{(2)}$ whose sum is precisely the same as $E_i^{(1)} + E_j^{(2)}$, the combined
state (155) will also be non-degenerate. This will generally be the case
when the two systems are different in a physical sense.

But if they are similar, e.g., both electrons, or both hydrogen atoms,
another situation arises. We may then drop all superscripts in the
description of the states, and write (155)

$$E = E_i + E_j, \qquad \Psi = \psi_i(q_1)\cdot\psi_j(q_2) \tag{11-156}$$

This state is degenerate, although $\psi_i$ and $\psi_j$ are not; for if we interchange
the indices i and j, or what is the same, interchange the coordinates $q_1$
and $q_2$ in $\Psi$, there results a different $\Psi$-function but not a different energy.
This degeneracy, which is peculiar to the description of any aggregate of
similar systems, is known as exchange degeneracy. Classically it implies
that the energy of the total system is unaltered when two individual
constituents exchange places and spins.

In the more general case where $E_i$ has $g_i$ and $E_j$ has $g_j$ linearly
independent functions associated with it, the number of $\Psi$'s corresponding to E
will be, not $g_i g_j$, but $2g_i g_j$.

Returning to the case of non-degeneracy of $\psi_i$ and $\psi_j$ we note that the
two functions

$$\Psi_I = \psi_i(q_1)\,\psi_j(q_2), \qquad \Psi_{II} = \psi_j(q_1)\,\psi_i(q_2)$$

which are linearly independent, are equally good representatives of the
state in which $E = E_i + E_j$. Moreover, any linear combination of the
two satisfies the Schrödinger equation for this value of E, and has just
claim to be considered. Of course, only two such combinations can be
linearly independent. Let us then consider the function

$$a\Psi_I + b\Psi_{II}$$

where we shall assume $|a|^2 + |b|^2 = 1$ to assure normalization. On
"exchanging" the two systems, $\Psi_I \to \Psi_{II}$ and $\Psi_{II} \to \Psi_I$, hence the
function above transforms itself into

$$b\Psi_I + a\Psi_{II}$$

the numerical value of which for any given configuration $(q_1, q_2)$ will in
general be different from $a\Psi_I + b\Psi_{II}$. Physically, this implies that the
configuration which results when the two systems exchange places has an
altogether different probability than the original, a consequence that is
clearly objectionable.

However, among all linear combinations there are two which avoid this
dilemma. They are the "symmetric"³² combination

$$\Psi_S(q_1, q_2) = \sqrt{\tfrac{1}{2}}\,(\Psi_I + \Psi_{II})$$

and the "antisymmetric" one

$$\Psi_A(q_1, q_2) = \sqrt{\tfrac{1}{2}}\,(\Psi_I - \Psi_{II})$$

They are independent and indeed orthogonal; the first remains unaltered
on exchange of systems, the second changes its sign. Both, therefore, yield
probabilities $|\Psi|^2$ which are insensitive to exchange.
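The behaviour of the three kinds of combination under exchange can be made concrete with a small numerical sketch. The one-dimensional particle-in-a-box functions used for $\psi_i$ and $\psi_j$, the coefficients 0.8 and 0.6, and the sample configuration are all illustrative assumptions.

```python
import math

def box_state(n):
    """Normalized psi_n(q) = sqrt(2) sin(n pi q) on 0 <= q <= 1 (illustrative choice)."""
    return lambda q: math.sqrt(2.0) * math.sin(n * math.pi * q)

def combination(psi_i, psi_j, a, b):
    """a*Psi_I + b*Psi_II with Psi_I = psi_i(q1)psi_j(q2), Psi_II = psi_j(q1)psi_i(q2)."""
    return lambda q1, q2: a * psi_i(q1) * psi_j(q2) + b * psi_j(q1) * psi_i(q2)

def exchange_change(Psi, q1, q2):
    """Change in |Psi|^2 when the two systems trade places."""
    return abs(abs(Psi(q2, q1)) ** 2 - abs(Psi(q1, q2)) ** 2)

psi_i, psi_j = box_state(1), box_state(2)
r = 1.0 / math.sqrt(2.0)
symmetric = combination(psi_i, psi_j, r, r)
antisymmetric = combination(psi_i, psi_j, r, -r)
generic = combination(psi_i, psi_j, 0.8, 0.6)   # |a|^2 + |b|^2 = 1, but a != +-b

for name, Psi in (("symmetric", symmetric), ("antisymmetric", antisymmetric), ("generic", generic)):
    print(name, exchange_change(Psi, 0.2, 0.7))
```

Only the symmetric and antisymmetric combinations leave the probability density unchanged; the generic combination assigns different probabilities to the two configurations.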

Consider now, not two, but n independent similar systems, in states
$\psi_i, \psi_j, \cdots, \psi_s$. The assemblage has the energy $E = E_i + E_j + \cdots + E_s$
and is described by the state function

$$\Psi(q_1, q_2, \cdots, q_n) = \psi_i(q_1)\,\psi_j(q_2)\cdots\psi_s(q_n) \tag{11-157}$$

But every permutation of the q's among the $\psi$'s on the right will produce a
new function belonging to the same E, provided the subscripts i, j, $\cdots$, s
are all different (which we shall assume for the moment). Hence, if P
represents any one of the n! possible permutations of the q's and
$\Psi_P(q_1, q_2, \cdots, q_n)$ the function which results from (157) when this
permutation is made, then

$$\Psi(q_1, q_2, \cdots, q_n) = \sum_P a_P \Psi_P \tag{11-158}$$

where the $a_P$ are arbitrary constants, one for each permutation (arbitrary
except for the normalization condition), represents an acceptable state
function for the energy E. Since there were originally n! linearly
independent functions, there will also be n! linearly independent combinations
of the type (158).

32 A function is said to be symmetric with respect to a given operation if the
operation leaves it unchanged; it is said (in quantum mechanics) to be antisymmetric if the
operation changes its sign without altering it in any other way.

Fortunately, most of these are uninteresting, for they cause

$$|\Psi(q_1, q_2, \cdots, q_n)|^2$$

to change when an exchange is made among any of the q's. There are
certainly two combinations, however, which preserve probabilities on
exchange. One is the symmetrical, the other the antisymmetrical
combination. The symmetrical one is formed by making all the coefficients $a_P$
in (158) equal:

$$\Psi_S(q_1, q_2, \cdots, q_n) = (n!)^{-1/2}\sum_P \Psi_P \tag{11-159}$$

the antisymmetric one by giving opposite signs to even and odd permutations
(cf. Chapter 15):

$$\Psi_A(q_1, q_2, \cdots, q_n) = (n!)^{-1/2}\sum_P (-1)^P \Psi_P \tag{11-160}$$

A practical way of constructing (160) is to write the determinant

$$\Psi_A = (n!)^{-1/2}\begin{vmatrix}
\psi_i(q_1) & \psi_i(q_2) & \psi_i(q_3) & \cdots & \psi_i(q_n)\\
\psi_j(q_1) & \psi_j(q_2) & \psi_j(q_3) & \cdots & \psi_j(q_n)\\
\vdots & & & & \vdots\\
\psi_s(q_1) & \psi_s(q_2) & \psi_s(q_3) & \cdots & \psi_s(q_n)
\end{vmatrix} \tag{11-160'}$$

which the reader will easily recognize as equivalent to the expansion (160).
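The sums (11-159) and (11-160) can be written out directly as sums over permutations. The sketch below does so for three systems, with particle-in-a-box functions as hypothetical single-system states; it shows the sign change of $\Psi_A$ under an exchange and its vanishing when two of the single-system states coincide. The normalizing factor $(n!)^{-1/2}$ is the one quoted in the text and is correct only for distinct orthonormal states.

```python
import math
from itertools import permutations

def parity(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def assemble(orbitals, antisymmetric):
    """Return Psi_A of (11-160) if antisymmetric is True, else Psi_S of (11-159)."""
    n = len(orbitals)
    norm = 1.0 / math.sqrt(math.factorial(n))
    def Psi(*q):
        total = 0.0
        for perm in permutations(range(n)):
            term = parity(perm) if antisymmetric else 1
            for orbital, k in zip(orbitals, perm):
                term *= orbital(q[k])      # orbital number r evaluated at q_{P(r)}
            total += term
        return norm * total
    return Psi

def box_state(n):
    """Illustrative single-system state sqrt(2) sin(n pi q) on 0 <= q <= 1."""
    return lambda q: math.sqrt(2.0) * math.sin(n * math.pi * q)

b1, b2, b3 = box_state(1), box_state(2), box_state(3)
psi_a = assemble([b1, b2, b3], antisymmetric=True)
print(psi_a(0.15, 0.4, 0.8), psi_a(0.4, 0.15, 0.8))              # equal and opposite
print(assemble([b1, b1, b2], antisymmetric=True)(0.15, 0.4, 0.8))   # vanishes
print(assemble([b1, b1, b2], antisymmetric=False)(0.15, 0.4, 0.8))  # does not vanish
```

For more than a handful of systems the determinant form (160') is far cheaper to evaluate than the explicit sum over n! permutations used here.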

It is to these two functions, $\Psi_S$ and $\Psi_A$, that we must confine our
attention. Lest the simplicity of our formalism obscure significant details,
we recall that $q_r$ stands for all coordinates of the r-th system. Thus, if
the systems were electrons, $\psi_i(q_r)$ would be an abbreviation for a
combination of space and spin functions:

$$\phi_{i+}(x_r, y_r, z_r)\,\alpha(s_r) + \phi_{i-}(x_r, y_r, z_r)\,\beta(s_r)$$

in the notation of sec. 29, and an interchange of $q_r$ and $q_p$ means that $x_r$
is to be exchanged against $x_p$, $y_r$ against $y_p$, $z_r$ against $z_p$ and $s_r$ against $s_p$.

There is no a priori way of deciding which of the two functions, (159)
or (160), is preferable. But here the exclusion principle, early recognized
by Pauli, creates simplicity in a most effective way. It states that if the
individual systems belong to a certain class (see below), only antisymmetric
functions may be used in describing the assemblage. This principle is of the
nature of a postulate; it has not yet been deduced from more fundamental
axioms, although one might hope, from a mathematical point of view, that
this will prove possible.³³ Why nature insists upon antisymmetric states
for some and symmetric states for others among its creatures is at present
a puzzle.

33 A very searching and interesting examination of the principle in the light of other
fundamental issues has been given by Pauli, Phys. Rev. 58, 716 (1940).

The elementary systems to which Pauli's principle is known to apply
are: electrons, positrons, protons, neutrons, neutrinos and mu-mesons;
photons, on the other hand, and several kinds of meson, are described by
symmetrical state functions.

Perhaps the most important consequence of the exclusion principle is
this. Suppose our assemblage consists of electrons, two of which are
described by the same function $\psi_i$ (i.e., the functions are identical with
respect to positional and spin factors). The determinant (160') will then
have two equal rows, and hence will vanish. We may therefore say: two
systems obeying the Pauli principle cannot be in the same state. This fact
governs the structure of atoms and molecules; each electron added to the
shell of an atom must have its own set of quantum numbers.

The exclusion principle makes it impossible to distinguish two states
which differ only by an interchange of two constituent systems, a fact which
has already been noted.

Photons, which are described by the symmetrical function (159), may
exist in identical states, because that function does not vanish when two
sets of indices, like i and j, contained in $\Psi_P$ become equal.

11.34. Excited States of the Helium Atom.—To show how the Pauli
principle is applied we treat some of the excited states of the helium atom.
The latter is to be regarded as a simple assemblage of 2 electrons moving in
the Coulomb field of the nucleus (and under their mutual repulsion),
hence the considerations of the foregoing section apply. However, in the
first part of our treatment we shall ignore both the electron spin and the
exclusion principle.

The Schrödinger equation has already been given (eq. 90); it is

$$\left(H_1 + H_2 + \frac{e^2}{r_{12}}\right)\psi = E\psi \tag{11-161}$$

where

$$H_i = -\frac{\hbar^2}{2m}\nabla_i^2 - \frac{2e^2}{r_i}$$

If the term $e^2/r_{12}$ were absent the two electrons would be independent, and
$\psi$ would be a product of the form $\psi_i(q_1)\cdot\psi_j(q_2)$, E being $E_i + E_j$. Moreover
$\psi_i$ and $\psi_j$ would be hydrogen eigenfunctions with atomic number
Z = 2, for $H_1$ and $H_2$ are Hamiltonian operators for a single electron in a
Coulomb field. To retain the notation of sec. 19 we shall now write u for
the individual electron functions, so that, in the absence of the interaction
term,

$$\psi = u_i(x_1 y_1 z_1)\,u_j(x_2 y_2 z_2) \tag{11-162}$$

Functions of this type will be used as variation functions with the
complete Hamiltonian (161). Let us first give thought to the proper choice of
the individual functions u. The state corresponding to the lowest energy
of a single electron is (cf. eq. 67a)

$$u_{10} = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2} e^{-Zr/a_0} \tag{11-163}$$

We are writing here, in place of the single subscript i, the values of the two
quantum numbers n = 1 and l = 0. The first excited state is either

$$u_{20} = R_{20}\,Y_0$$

or

$$u_{21} = R_{21}\,Y_1(\theta, \phi)$$

The spherical harmonic $Y_0$ is a constant, but $Y_1$ is any linear combination
of the three functions $P_1^1(\cos\theta)e^{i\phi}$, $P_1^0(\cos\theta)$ and $P_1^1(\cos\theta)e^{-i\phi}$. It will be
convenient to choose the following normalized combinations

$$Y_x = \sqrt{\frac{3}{16\pi}}\,[P_1^1(\cos\theta)e^{i\phi} + P_1^1(\cos\theta)e^{-i\phi}] = \sqrt{\frac{3}{4\pi}}\sin\theta\cos\phi = \sqrt{\frac{3}{4\pi}}\,\frac{x}{r}$$
$$Y_y = -i\sqrt{\frac{3}{16\pi}}\,[P_1^1(\cos\theta)e^{i\phi} - P_1^1(\cos\theta)e^{-i\phi}] = \sqrt{\frac{3}{4\pi}}\sin\theta\sin\phi = \sqrt{\frac{3}{4\pi}}\,\frac{y}{r}$$
$$Y_z = \sqrt{\frac{3}{4\pi}}\,P_1^0(\cos\theta) = \sqrt{\frac{3}{4\pi}}\cos\theta = \sqrt{\frac{3}{4\pi}}\,\frac{z}{r}$$

and to define³⁴

$$u_{20} = R_{20}Y_0, \quad u_{2x} = R_{21}Y_x, \quad u_{2y} = R_{21}Y_y, \quad u_{2z} = R_{21}Y_z \tag{11-164}$$

as the four independent, orthonormal functions describing the first excited
state of the one-electron system. The product (162) can be formed by
combining $u_{10}$ with any one of the four functions (164); furthermore, the
arguments can be interchanged in each of the functions thus constructed.
We are therefore concerned with the following eight functions, each of
which is a solution of eq. (161) with the term $e^2/r_{12}$ deleted, and belongs
to the energy

$$E_0 = -\frac{2e^2}{a_0}\left(1 + \frac{1}{4}\right) = 5E_H \tag{11-165}$$

34 $R_{nl}$ is given in eq. (65); its explicit form will not be needed here.

$$\begin{aligned}
\psi_1 &= u_{10}(1)\,u_{20}(2) & \psi_2 &= u_{20}(1)\,u_{10}(2)\\
\psi_3 &= u_{10}(1)\,u_{2x}(2) & \psi_4 &= u_{2x}(1)\,u_{10}(2)\\
\psi_5 &= u_{10}(1)\,u_{2y}(2) & \psi_6 &= u_{2y}(1)\,u_{10}(2)\\
\psi_7 &= u_{10}(1)\,u_{2z}(2) & \psi_8 &= u_{2z}(1)\,u_{10}(2)
\end{aligned} \tag{11-166}$$

In writing them we have indicated the arguments $(x_1 y_1 z_1)$ and $(x_2 y_2 z_2)$
simply by (1) and (2). A combination of these functions

$$\psi = \sum_{\lambda=1}^{8} a_\lambda \psi_\lambda$$

will be used as a variation function in the sense of sec. 20. The best
energies of the system are given by (97), and this reduces at once to the form
(104) because the $\psi_\lambda$ are orthonormal and belong to the operator
$H^0 = H_1 + H_2$. The perturbing term is

$$H' = \frac{e^2}{r_{12}}$$
The next step in the solution of our problem is the calculation of the
matrix elements $H'_{ij} = \int \psi_i H' \psi_j\,dx_1dy_1dz_1dx_2dy_2dz_2$, using the functions (166),
the details of which may be left for the reader.³⁵ Symmetry arguments
may be used to show that

$$H'_{11} = H'_{22}, \quad H'_{33} = H'_{44}, \quad H'_{55} = H'_{66}, \quad H'_{77} = H'_{88}$$

and that only functions in the same line of (166) give non-vanishing
elements. Furthermore the volume element adopted in the evaluation of J
(sec. 19) is convenient in proving:

$$H'_{33} = H'_{55} = H'_{77}; \qquad H'_{34} = H'_{56} = H'_{78}$$

Since the $\psi$ are real, $H'_{ij} = H'_{ji}$. We are left, therefore, only with the
following matrix elements:

$$H'_{11} = \int u_{10}^2(1)\,\frac{e^2}{r_{12}}\,u_{20}^2(2)\,d\tau = J$$
$$H'_{12} = \int u_{10}(1)\,u_{20}(2)\,\frac{e^2}{r_{12}}\,u_{20}(1)\,u_{10}(2)\,d\tau = K$$
$$H'_{33} = \int u_{10}^2(1)\,\frac{e^2}{r_{12}}\,u_{2x}^2(2)\,d\tau = J'$$
$$H'_{34} = \int u_{10}(1)\,u_{2x}(2)\,\frac{e^2}{r_{12}}\,u_{2x}(1)\,u_{10}(2)\,d\tau = K'$$

35 See Heisenberg, W., Z. Phys. 39, 499 (1926).

In a sense previously defined (see sec. 11.21), J and J' are Coulomb
integrals, K and K' exchange integrals.
The determinantal eq. (97) becomes

$$\begin{vmatrix}
J-\epsilon & K & & & & & & \\
K & J-\epsilon & & & & & & \\
 & & J'-\epsilon & K' & & & & \\
 & & K' & J'-\epsilon & & & & \\
 & & & & J'-\epsilon & K' & & \\
 & & & & K' & J'-\epsilon & & \\
 & & & & & & J'-\epsilon & K' \\
 & & & & & & K' & J'-\epsilon
\end{vmatrix} = 0 \tag{11-167}$$

provided we write $\epsilon$ for $E - E_0$. All elements not written are zeros. The
determinant has two single roots: $\epsilon_1 = J - K$, $\epsilon_2 = J + K$ and two
triple roots: $\epsilon_3 = J' - K'$, $\epsilon_4 = J' + K'$. The perturbation $e^2/r_{12}$ may
therefore be said to change the one unperturbed level $E_0$ into four
perturbed levels: $E_0 + \epsilon_1$, $E_0 + \epsilon_2$, $E_0 + \epsilon_3$, $E_0 + \epsilon_4$, as indicated
qualitatively in the diagram (Fig. 6).

[Fig. 11-6: splitting of the unperturbed level $E_0$ into the perturbed levels]

To find the functions corresponding to the eight roots $\epsilon$ we must return
to equations (96):

$$a_1(J - \epsilon) + a_2K = 0$$
$$a_1K + a_2(J - \epsilon) = 0$$
$$a_3(J' - \epsilon) + a_4K' = 0$$
$$a_3K' + a_4(J' - \epsilon) = 0, \text{ etc.}$$

On substituting $\epsilon_1$ for $\epsilon$ we find $a_2 = -a_1$, $a_3 = a_4 = \cdots = a_8 = 0$. On
substituting $\epsilon = \epsilon_2$, we find $a_2 = a_1$, $a_3 = a_4 = \cdots = a_8 = 0$, and so forth.
We thus obtain the set of energies and normalized variation functions given
in the first two columns of Table 1.

TABLE 1

  E                      ψ                                        Symmetry   Multiplicity
  $E_0 + J - K$          $\sqrt{1/2}\,(\psi_1 - \psi_2)$          a          Triplet
  $E_0 + J + K$          $\sqrt{1/2}\,(\psi_1 + \psi_2)$          s          Singlet
  $E_0 + J' - K'$        $\sqrt{1/2}\,(\psi_3 - \psi_4)$          a          Triplet
                         $\sqrt{1/2}\,(\psi_5 - \psi_6)$          a          Triplet
                         $\sqrt{1/2}\,(\psi_7 - \psi_8)$          a          Triplet
  $E_0 + J' + K'$        $\sqrt{1/2}\,(\psi_3 + \psi_4)$          s          Singlet
                         $\sqrt{1/2}\,(\psi_5 + \psi_6)$          s          Singlet
                         $\sqrt{1/2}\,(\psi_7 + \psi_8)$          s          Singlet
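The entries of Table 1 can be verified by applying the 8×8 perturbation matrix to each listed combination. In the sketch below the numerical values chosen for J, K, J', K' are purely hypothetical placeholders (the actual integrals are not evaluated in this section); only the block structure of (11-167) matters for the check.

```python
import math

def perturbation_matrix(J, K, Jp, Kp):
    """8x8 matrix of H' in the basis psi_1..psi_8, block diagonal as in (11-167)."""
    H = [[0.0] * 8 for _ in range(8)]
    for n, (diag, off) in enumerate([(J, K), (Jp, Kp), (Jp, Kp), (Jp, Kp)]):
        i = 2 * n
        H[i][i] = H[i + 1][i + 1] = diag
        H[i][i + 1] = H[i + 1][i] = off
    return H

def table_entries(J, K, Jp, Kp):
    """(epsilon, coefficients a_1..a_8) for the eight functions of Table 1."""
    r = 1.0 / math.sqrt(2.0)
    entries = []
    for n, (diag, off) in enumerate([(J, K), (Jp, Kp), (Jp, Kp), (Jp, Kp)]):
        for sign in (-1, +1):
            a = [0.0] * 8
            a[2 * n], a[2 * n + 1] = r, sign * r
            entries.append((diag + sign * off, a))
    return entries

def worst_residual(J, K, Jp, Kp):
    """Largest component of H'a - epsilon*a over all eight table entries."""
    H = perturbation_matrix(J, K, Jp, Kp)
    worst = 0.0
    for eps, a in table_entries(J, K, Jp, Kp):
        Ha = [sum(h * x for h, x in zip(row, a)) for row in H]
        worst = max(worst, max(abs(y - eps * x) for y, x in zip(Ha, a)))
    return worst

# Hypothetical values of the integrals, in arbitrary energy units.
print(worst_residual(J=0.42, K=0.05, Jp=0.49, Kp=0.03))  # rounding-error level
```

The antisymmetric combinations (sign −1) belong to $\epsilon = J - K$ and $J' - K'$, the symmetric ones to $J + K$ and $J' + K'$, in agreement with the table.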


It now becomes necessary to include the spin in our analysis. To do
this accurately would require a modification of the Hamiltonian operator
(161), for the magnetic moments of the spinning electrons produce an
interaction with the magnetic field due to their orbital motions and this
interaction has not been included in (161). We shall omit this spin-orbit
interaction and refer the reader to the literature for the more accurate
treatment.³⁶ In other words, we shall suppose that the Hamiltonian does
not act on the spin coordinates. The state function is then separable and
appears as the product of an orbital (any of the functions in the table)
and a spin function, and the latter may be taken as an eigenfunction of $\sigma_z$
for each electron. Let us consider these spin functions more closely.
For the two electrons, we have four functions:

$$\alpha(s_1)\alpha(s_2), \quad \alpha(s_1)\beta(s_2), \quad \beta(s_1)\alpha(s_2), \quad \text{and} \quad \beta(s_1)\beta(s_2)$$

These, however, do not have convenient exchange properties, for when $s_1$
and $s_2$ are interchanged, the first and last remain unaltered, the second
transforms into the third and the third into the second. But it is possible
to construct from the second and third two other, equivalent functions,
which are symmetrical and antisymmetrical with respect to an exchange of
spin coordinates. They are, when normalized,
$\sqrt{1/2}\,[\alpha(s_1)\beta(s_2) + \beta(s_1)\alpha(s_2)]$
and $\sqrt{1/2}\,[\alpha(s_1)\beta(s_2) - \beta(s_1)\alpha(s_2)]$. We have in this way obtained four
spin functions

$$\Sigma_1 = \alpha(s_1)\alpha(s_2), \quad \Sigma_2 = \sqrt{\tfrac{1}{2}}\,[\alpha(s_1)\beta(s_2) + \beta(s_1)\alpha(s_2)], \quad \Sigma_3 = \beta(s_1)\beta(s_2);$$
$$A = \sqrt{\tfrac{1}{2}}\,[\alpha(s_1)\beta(s_2) - \beta(s_1)\alpha(s_2)] \tag{11-168}$$

the first three of which are symmetrical, only the last being antisymmetrical.
Furthermore, this set of functions is orthogonal (and complete).

36 Condon, E. U., and Shortley, G. H., "The Theory of Atomic Spectra," Macmillan
Co., New York, 1935.

To include the spin we need only multiply each one of the functions in
Table 1 by one of the spin functions $\Sigma_1$ to A, a procedure which yields 32
different functions of position and spin coordinates. But here the exclusion
principle effects a great simplification. It says that only functions which
are antisymmetrical when all coordinates, i.e., position and spin
coordinates, of the two electrons are interchanged, are to be permitted. Hence a
function of Table 1 which is symmetrical can only be combined with A,
and a function which is antisymmetrical only with $\Sigma_1$, $\Sigma_2$ and $\Sigma_3$.

Now the functions marked a in the table are antisymmetric; they can
be multiplied by any one of the three $\Sigma$-functions. Each of them
corresponds, therefore, to three states. For this reason the energy states
$E_0 + J' - K'$ and $E_0 + J - K$ are said to be triplet states. If spin-orbit
interaction had been included in our calculation each of these levels
would have appeared as three closely adjacent levels, while the other
energies, marked singlets, would have remained single.

It is true that the functions in Table 1 are only approximate solutions
of eq. (161). Nevertheless what we have said about their symmetry
with respect to exchange of electrons may be shown to hold rigorously.
The structure of the helium energy spectrum, and in particular the
singlet-triplet character of the states, are therefore correctly given by the simple
theory of this section; the numerical values of the energy levels will be in
error.

The normal state of the helium atom, whose energy was computed
approximately in sec. 19 of this chapter, is given in the present notation by
$u_{10}(1)\,u_{10}(2)$, if we neglect the spin. It is clearly symmetrical and can
only be multiplied by A when the spins are introduced. Hence it is a
singlet state. When the helium atom is in a singlet state, its probability
of passing into a triplet state under emission or absorption of radiation is
very small, as may be shown by an extension of the methods used in
sec. 11.28. Hence triplet and singlet levels do not "combine," and helium
may be said to have two distinct spectra, the triplet spectrum to which
spectroscopists apply the term "orthohelium" spectrum, and the singlet
spectrum called "parhelium" spectrum.

Problem a. Instead of using the 8 functions (166) as linear variation functions,
start with the 32 functions obtained from (166) by multiplying each of them by $\Sigma_1$, $\Sigma_2$,
$\Sigma_3$, A. Show that, if these 32 functions are suitably arranged, the determinantal
equation is a four-fold repetition of the one obtained above, and that it yields the same
results in regard to both energies and functions.

Problem b. The following spin operators for two electrons may be defined:

$$\sigma_z = \sigma_{z1} + \sigma_{z2}$$
$$\sigma^2 = (\sigma_1 + \sigma_2)^2 = \sigma_{x1}^2 + \sigma_{y1}^2 + \sigma_{z1}^2 + \sigma_{x2}^2 + \sigma_{y2}^2 + \sigma_{z2}^2 + 2(\sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2})$$

where $\sigma_{z1}$ is the operator $\sigma_z$ acting on spin coordinate $s_1$, etc. Show that $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ and
A are all eigenstates with respect to both of these operators, in particular that

$$\sigma_z\Sigma_1 = 2\Sigma_1, \quad \sigma_z\Sigma_2 = 0, \quad \sigma_z\Sigma_3 = -2\Sigma_3, \quad \sigma_z A = 0$$
$$\sigma^2\Sigma_1 = 8\Sigma_1, \quad \sigma^2\Sigma_2 = 8\Sigma_2, \quad \sigma^2\Sigma_3 = 8\Sigma_3, \quad \sigma^2 A = 0$$

Are these results consistent with the classical interpretation according to which
$\Sigma_1$ is the state in which both spins are parallel and along Z,
$\Sigma_2$ is the state in which both spins are parallel and perpendicular to Z,
$\Sigma_3$ is the state in which both spins are parallel and along −Z,
A is the state in which both spins are opposed and yield no resultant angular momentum?
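The eigenvalue statements of Problem b can be checked by representing the two-electron spin operators as 4×4 matrices in the basis $\alpha\alpha$, $\alpha\beta$, $\beta\alpha$, $\beta\beta$. The sketch below assumes the usual Pauli matrices for a single spin; this is an assumption about the matrix representation, but the eigenvalues obtained do not depend on the sign convention chosen for $\sigma_y$, since only products of two $\sigma_y$ factors occur.

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(A, B):
    """Kronecker product; the first factor refers to electron 1, the second to electron 2."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def total(op):
    """Operator acting on both electrons: op(1) + op(2)."""
    return add(kron(op, I2), kron(I2, op))

SIGMA_Z = total(SZ)
SIGMA_SQ = add(add(matmul(total(SX), total(SX)), matmul(total(SY), total(SY))),
               matmul(total(SZ), total(SZ)))

r = 1.0 / math.sqrt(2.0)
STATES = {"Sigma1": [1, 0, 0, 0], "Sigma2": [0, r, r, 0],
          "Sigma3": [0, 0, 0, 1], "A": [0, r, -r, 0]}

def eigenvalue(op, v, tol=1e-12):
    """Return lambda if op v = lambda v; raise ValueError if v is not an eigenvector."""
    w = matvec(op, v)
    k = max(range(len(v)), key=lambda i: abs(v[i]))
    lam = w[k] / v[k]
    if any(abs(wi - lam * vi) > tol for wi, vi in zip(w, v)):
        raise ValueError("not an eigenvector")
    return lam.real if isinstance(lam, complex) else float(lam)

for name, v in STATES.items():
    print(name, eigenvalue(SIGMA_Z, v), eigenvalue(SIGMA_SQ, v))
```

The three symmetric functions share the eigenvalue 8 of $\sigma^2$ and differ in $\sigma_z$, while A belongs to the eigenvalue 0 of both operators.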


11.35. The Hydrogen Molecule.—One of the stumbling blocks of pre- 
quantum chemistry was the phenomenon of homo-polar binding; it is 
impossible to explain on the basis of classical dynamics the union of two
hydrogen atoms to form a molecule. The only attraction which two
neutral structures like H-atoms could possibly exhibit was due to
quadrupole forces, and these were known to be too weak to account for molecular
binding. It was shown by Heitler and London that the homo-polar bond
is caused by a typical quantum-mechanical effect: the "exchange" of
the two electrons. Its meaning will be clear from the following discussion.

The method of calculation³⁷ to be employed is a simple one which lays
little claim to quantitative accuracy³⁸ but exposes the significant facts in a
beautiful way. It is similar to the treatment of the $H_2^+$-ion, from which it
differs by the presence of two electrons instead of one. The coordinate
system to be used will be clear from Fig. 7; particles 1 and 2 are electrons,
A and B are the protons whose positions are regarded as fixed. In
connection with Fig. 7, we also wish to outline the use of a coordinate system and
a volume element which are very convenient in the numerical work involved
in this problem.

[Fig. 11-7: coordinates for the hydrogen molecule; electrons 1 and 2, protons A and B]

The coordinate system for the two electrons will contain the six variables
$A_1$, $B_1$, $B_2$, $r_{12}$, $\phi_1$, $\phi_2$;

$$|B_2 - B_1| \le r_{12} \le B_1 + B_2, \qquad 0 \le B_2 \le \infty$$
$$|B_1 - R| \le A_1 \le B_1 + R, \qquad 0 \le B_1 \le \infty$$

37 Heitler, W., and London, F., Z. Phys. 44, 455 (1927).
38 The most elaborate and accurate calculation, also employing the variational
method, was made by James, H. M., and Coolidge, A. S., J. Chem. Phys. 1, 825 (1933).

The volume element is $d\tau = d\tau_1 d\tau_2$, where

$$d\tau_1 = A_1^2\,dA_1 \sin\theta_1\,d\theta_1\,d\phi_1$$

Now

$$B_1^2 = A_1^2 + R^2 - 2A_1R\cos\theta_1$$

whence

$$2B_1\,dB_1 = 2A_1R\sin\theta_1\,d\theta_1$$

On eliminating $\sin\theta_1\,d\theta_1$ from $d\tau_1$ by means of this last relation, we find

$$d\tau_1 = \frac{1}{R}\,A_1\,dA_1\,B_1\,dB_1\,d\phi_1$$

The element $d\tau_2$ is obtained by writing down an expression similar to $d\tau_1$
but using $B_1$ as base line:

$$d\tau_2 = \frac{1}{B_1}\,r_{12}\,dr_{12}\,B_2\,dB_2\,d\phi_2$$

Hence the product $d\tau_1 d\tau_2$ is

$$d\tau = \frac{1}{R}\,A_1\,dA_1\,B_2\,dB_2\,r_{12}\,dr_{12}\,dB_1\,d\phi_1\,d\phi_2 \tag{11-169}$$

Several similar volume elements can be constructed by the same method.


After this excursion, let us consider the Schrödinger equation of the
$H_2$-problem. It is

$$H\psi = \left[-\frac{\hbar^2}{2m}(\nabla_1^2 + \nabla_2^2) - e^2\left(\frac{1}{A_1} + \frac{1}{B_2} + \frac{1}{A_2} + \frac{1}{B_1} - \frac{1}{r_{12}} - \frac{1}{R}\right)\right]\psi = E\psi \tag{11-170}$$

We endeavor to solve it by the method of linear variation functions,
choosing as constituents of the trial function simple but reasonable
approximations to the correct $\psi$. If H did not contain the last four items in the
parenthesis multiplying $e^2$ it would simply be the sum of two hydrogen-atom
Hamiltonians, and

$$\psi = u_A(1)\,u_B(2)$$

where

$$u_A(1) = (\pi a_0^3)^{-1/2}\,e^{-A_1/a_0}, \qquad u_B(2) = (\pi a_0^3)^{-1/2}\,e^{-B_2/a_0}$$

are hydrogen functions centered about A and B respectively. On the
other hand, if the terms $1/A_1 + 1/B_2 - 1/r_{12} - 1/R$ were missing from
the parenthesis, H would also be the sum of two hydrogen-atom
Hamiltonians, but $\psi = u_B(1)\,u_A(2)$. Both of these $\psi$'s are equally good
approximations, and both must be included in the trial function. Note that they
differ with respect to an exchange of the electrons (or, what amounts in this
problem to the same thing, the protons). Hence we adopt

$$\psi = c_1\,u_A(1)\,u_B(2) + c_2\,u_B(1)\,u_A(2) \tag{11-170a}$$

as variation function in minimizing $\int \psi H\psi\,d\tau$. As explained in sec. 20,
the process leads to the secular equations

$$c_1(H_{11} - \Delta_{11}E) + c_2(H_{12} - \Delta_{12}E) = 0$$
$$c_1(H_{21} - \Delta_{21}E) + c_2(H_{22} - \Delta_{22}E) = 0 \tag{11-171}$$

and E is given by

$$\begin{vmatrix} H_{11} - \Delta_{11}E & H_{12} - \Delta_{12}E\\ H_{21} - \Delta_{21}E & H_{22} - \Delta_{22}E \end{vmatrix} = 0 \tag{11-172}$$

Here

$$\Delta_{11} = \int u_A^2(1)\,u_B^2(2)\,d\tau_1 d\tau_2 = \Delta_{22} = 1$$

$$\Delta_{12} = \int u_A(1)\,u_B(2)\,u_B(1)\,u_A(2)\,d\tau_1 d\tau_2 = \left(\int u_A u_B\,d\tau\right)^2 = \Delta_{21}$$

The latter integral is familiar from sec. 21; it is the quantity there called
$\Delta_{AB}$. Hence

$$\Delta_{12} = \Delta_{21} = e^{-2\rho}\left(1 + \rho + \frac{\rho^2}{3}\right)^2, \qquad \rho = \frac{R}{a_0}$$
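The one-electron volume element $d\tau_1 = (1/R)\,A_1\,dA_1\,B_1\,dB_1\,d\phi_1$ and the closed form of the overlap can be tested against each other by numerical quadrature. The sketch below works in units where $a_0 = 1$, integrates $u_A u_B$ over the region $|A - R| \le B \le A + R$ with a composite Simpson rule, and compares the result with $e^{-\rho}(1 + \rho + \rho^2/3)$, whose square is $\Delta_{12}$. The step counts and the cutoff on A are illustrative choices.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def overlap_numeric(rho, a_max=40.0):
    """Integral of u_A u_B over all space for internuclear distance rho (a_0 = 1)."""
    def inner(A):
        # B runs between the triangle limits |A - R| and A + R.
        return simpson(lambda B: B * math.exp(-B), abs(A - rho), A + rho, 200)
    def outer(A):
        return A * math.exp(-A) * inner(A)
    # u_A u_B = exp(-(A + B))/pi and the phi integration contributes 2*pi,
    # so the prefactor is (1/R) * 2*pi/pi = 2/R.
    integral = simpson(outer, 0.0, rho, 200) + simpson(outer, rho, rho + a_max, 2000)
    return 2.0 / rho * integral

def overlap_closed_form(rho):
    """exp(-rho)(1 + rho + rho^2/3); its square is Delta_12."""
    return math.exp(-rho) * (1.0 + rho + rho * rho / 3.0)

print(overlap_numeric(1.4), overlap_closed_form(1.4))
```

The outer integral is split at A = R because the lower limit |A − R| has a kink there, which would otherwise degrade the accuracy of Simpson's rule.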
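Next, we turn to

$$H_{11} = \int u_A(1)\,u_B(2)\,H\,u_A(1)\,u_B(2)\,d\tau_1 d\tau_2$$

The $\nabla^2$-terms in H need not be calculated; their effect upon $u_A(1)$ and
$u_B(2)$ is at once obtainable from the differential equations which these
functions satisfy:

$$-\frac{\hbar^2}{2m}\nabla_1^2 u_A(1) = \left(E_H + \frac{e^2}{A_1}\right)u_A(1),$$

$$-\frac{\hbar^2}{2m}\nabla_2^2 u_B(2) = \left(E_H + \frac{e^2}{B_2}\right)u_B(2)$$

In this way we find

$$H_{11} = 2E_H + 2J + J' + \frac{e^2}{R}$$

where

$$J = -e^2\int \frac{u_A^2(1)\,u_B^2(2)}{B_1}\,d\tau_1 d\tau_2 = -e^2\int \frac{u_A^2(1)}{B_1}\,d\tau_1$$

and

$$J' = e^2\int \frac{u_A^2(1)\,u_B^2(2)}{r_{12}}\,d\tau_1 d\tau_2 \tag{11-174}$$

J is given in sec. 21, eq. (100), and J' has the value

$$J' = \frac{e^2}{a_0\rho}\left[1 - e^{-2\rho}\left(1 + \frac{11}{8}\rho + \frac{3}{4}\rho^2 + \frac{1}{6}\rho^3\right)\right]$$

Problem. Prove this result, using the system of coordinates and the volume element
(169).

Furthermore,

$$H_{22} = H_{11}$$

as the reader will easily verify. In a similar way,

$$H_{12} = H_{21} = 2E_H\Delta_{12} + 2K\Delta_{12}^{1/2} + K' + \frac{e^2}{R}\Delta_{12}$$

where

$$K = -e^2\int \frac{u_A(1)\,u_B(1)}{B_1}\,d\tau_1 \tag{11-175}$$

and

$$K' = e^2\int \frac{u_A(1)\,u_B(1)\,u_A(2)\,u_B(2)}{r_{12}}\,d\tau_1 d\tau_2 \tag{11-176}$$

The value of K is given in eq. (100), and

$$K' = \frac{e^2}{5a_0}\left\{-e^{-2\rho}\left(-\frac{25}{8} + \frac{23}{4}\rho + 3\rho^2 + \frac{1}{3}\rho^3\right) + \frac{6}{\rho}\left[\Delta(\gamma + \ln\rho) - 2\sqrt{\Delta\Delta'}\,Ei(-2\rho) + \Delta'\,Ei(-4\rho)\right]\right\}$$

where $\gamma = 0.5772$ (Euler-Mascheroni constant),

$$\Delta = \Delta_{12}, \qquad \Delta' = e^{2\rho}\left(1 - \rho + \frac{\rho^2}{3}\right)^2$$

and Ei(x) is an abbreviation for the exponential integral

$$Ei(x) = \int_{-\infty}^{x}\frac{e^u}{u}\,du,$$

which is tabulated and discussed, for instance, in "Tables of Sine, Cosine
and Exponential Integrals," Federal Works Agency, New York, 1940.

Problem. Evaluate K'. See in this connection Sugiura, Y., Z. Phys. 45, 484
(1927).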


The two roots of (172) are

$$E_1 = \frac{H_{11} + H_{12}}{1 + \Delta_{12}} = 2E_H + \frac{e^2}{R} + \frac{2J + J' + 2K\Delta_{12}^{1/2} + K'}{1 + \Delta_{12}}$$
$$E_2 = \frac{H_{11} - H_{12}}{1 - \Delta_{12}} = 2E_H + \frac{e^2}{R} + \frac{2J + J' - 2K\Delta_{12}^{1/2} - K'}{1 - \Delta_{12}} \tag{11-177}$$

Substitution into (171) shows that to $E_1$ there corresponds the function

$$\psi_1 = [2(1 + \Delta_{12})]^{-1/2}\,[u_A(1)\,u_B(2) + u_B(1)\,u_A(2)] \tag{11-178}$$

and to $E_2$ the function

$$\psi_2 = [2(1 - \Delta_{12})]^{-1/2}\,[u_A(1)\,u_B(2) - u_B(1)\,u_A(2)] \tag{11-179}$$

The energies $E_1$ and $E_2$ are plotted against R, the internuclear distance, in
Pauling and Wilson.³⁹ It will be seen that $E_1$ has a minimum in the
neighborhood of the experimental internuclear distance of the $H_2$-molecule;
at this minimum $E_1$ is negative and equal in order of magnitude to the
experimentally known minimum which causes the stability of the molecule.
On the other hand, $E_2$ is positive for all R, decreasing in monotone fashion
with increasing R. It, therefore, corresponds to repulsion between the
atoms. Comparison of $E_1$ and $E_2$ shows the difference in their behavior as
functions of R to be predominantly due to the presence of the K and K'
integrals. These would have been missing if electron exchange had not
been taken account of by introducing the two functions constituting the $\psi$
of eq. (170a). In that case also, there would have been only one energy and
not two. Now while (170a) may be a crude approximation, the fact that
two equivalent functions, differing only with respect to electron exchange,
will compose the correct solution of (170) is beyond doubt, hence the
qualitative aspects here obtained cannot be questioned. The integrals K and
K' are called exchange integrals.

Let us now include the spin and apply the Pauli principle. The spin 
functions are those already encountered in the helium problem, eq. (168). 
If the resultant function is to be antisymmetrical, $\psi_1$, which is symmetrical 
in the position coordinates of electrons 1 and 2, must be multiplied by an 
antisymmetrical function of the spins, of which there is only one, namely $A$. 
However, $\psi_2$ may be multiplied by one of the three functions $\Sigma_1$, $\Sigma_2$ or $\Sigma_3$. 
It represents a triplet state while $\psi_1$ is a singlet. 

To the energy $E_2$, therefore, there correspond three times as many 
quantum mechanical states as to $E_1$. From this fact may be drawn the 
conclusion that when two H-atoms approach they will, ceteris paribus, be 
three times as likely to repel as to attract each other. 

39 Loc. cit., p. 344. 

CHAPTER 12 
STATISTICAL MECHANICS 


12.1. Permutations and Combinations.—The purpose of the present 
chapter is not primarily an exposition of the ideas of statistical mechanics, 
which is available in several modern texts,¹ but a brief and summary review 
of the chief analytical techniques used in the treatment of this subject. 
We begin by discussing the principal formulas of the theory of combinations. 


a. The number of possible permutations of $n$ different (distinguishable) 
objects is $n!$ 

The proof is simple: the first object can be put in $n$ different positions. 
When its place is fixed, $n - 1$ different positions are left open for the 
second. Hence these two objects can be arranged in $n(n - 1)$ different ways 
without disturbing the relative order of the remaining $(n - 2)$ objects. 
But the third can occupy $n - 2$ different places, and so on. The total 
number of possible arrangements is therefore $n(n - 1)(n - 2) \cdots 2 \cdot 1 = n!$ 


b. Suppose we wish to arrange the $n$ objects in $r$ piles, the number in 
each pile being prescribed. Let the number of objects in the first pile be 
$n_1$, that in the second $n_2$, etc., so that $\sum_{i=1}^{r} n_i = n$. It is desired to find the 
number, $M$, of possible arrangements of this kind. If $M$ is multiplied by 
the number of possible permutations of all objects in the first pile, then by 
the number of possible permutations of the objects in the second pile and 
so on for all the piles, we must obtain the total number of permutations of 
$n$ objects. Thus 

$$M\,n_1!\,n_2! \cdots n_r! = n!$$

whence 

$$M = \frac{n!}{n_1!\,n_2! \cdots n_r!} \tag{12-1}$$

There is another combinatorial problem which leads to the same result. 
Suppose the $n$ objects fall into $r$ classes, the objects in each class being alike 
(indistinguishable). Let the first class contain $n_1$ objects, the second $n_2$, 
etc. The number of possible distinguishable arrangements of the $n$ objects 
will then be seen to be obtainable by the reasoning employed above. 
Hence $M$ represents also the number of arrangements of $n$ things groupable 
into $r$ classes, the members of each class being alike. 

1 Tolman, R. C., “The Principles of Statistical Mechanics,” Clarendon Press, 
Oxford, 1938. Chapman, S. and Cowling, T. G., “The Mathematical Theory of 
Non-Uniform Gases,” University Press, Cambridge, 1939. Mayer, J. E. and Mayer, M. G., 
“Statistical Mechanics,” John Wiley and Sons, 1940. Lindsay, R. B., “Physical 
Statistics,” John Wiley and Sons, 1941. 
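
The counting rules (a) and (b) are easily checked by brute force for small cases. The following is a minimal Python sketch, not part of the original text; the helper names `multinomial` and `distinct_arrangements` and the sample strings are chosen here purely for illustration.

```python
from itertools import permutations
from math import factorial

def multinomial(counts):
    """Number of arrangements M = n! / (n_1! n_2! ... n_r!), eq. (12-1)."""
    n = sum(counts)
    m = factorial(n)
    for c in counts:
        m //= factorial(c)   # each partial quotient is an integer
    return m

def distinct_arrangements(letters):
    """Brute-force count of the distinguishable orderings of a string."""
    return len(set(permutations(letters)))

# (a) n different objects have n! permutations.
n_perms = len(list(permutations("abcd")))

# (b) classes of 2, 1 and 3 alike objects: formula versus enumeration.
by_formula = multinomial([2, 1, 3])
by_counting = distinct_arrangements("aabccc")
```

Enumeration grows as $n!$, so the direct count is practical only for a handful of objects; the formula has no such limit.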


c. The number of ways in which $m$ objects can be selected from a set of 
$n$ objects is $n!/[m!(n - m)!]$. This follows at once from (1), for a 
withdrawal of $m$ individuals is equivalent to an arrangement of the $n$ objects 
into two piles, one containing $m$, the other $(n - m)$ objects. We denote 
this number by 

$$\frac{n!}{m!\,(n - m)!} = \binom{n}{m} \tag{12-2}$$

It is often referred to as the number of combinations of $n$ things taken $m$ at a 
time. We observe that, since $\binom{n}{m} = \binom{n}{n-m}$, it is equal to the number 
of combinations of $n$ things taken $n - m$ at a time. 

Eq. (2) also provides the answer to another, apparently different ques- 
tion. Assume that we have n boxes, and a smaller number, m, of indis- 
tinguishable objects to be placed in them in such a way that no box con- 
tains more than one object. The number of ways in which this can be 
done is given by (2), for the assignment of m objects to n boxes is entirely 
equivalent to the selection of m objects from a set of n objects. 


d. When in accordance with theorem (c), a certain selection of $m$ 
objects has been made, a permutation among these $m$ objects does not 
produce a new combination. It does, however, produce a new 
arrangement. Thus, to every combination given by eq. (2), there correspond $m!$ 
arrangements of the $m$ objects. The total number of arrangements of $n$ 
things taken $m$ at a time is therefore 

$$m!\binom{n}{m} = \frac{n!}{(n - m)!} \tag{12-3}$$


If, in the problem of placing $m$ objects into $n$ boxes ($n \ge m$) discussed 
in (c), the objects are assumed to be distinguishable, so that our interest is 
no longer merely in the individual boxes each of which contains an object, 
but also in the arrangement of the individual objects placed in them, 
eq. (3) is applicable. It expresses the number of ways in which $m$ 
distinguishable objects can be placed in $n$ boxes, zero or one object per box. 
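
Eqs. (2) and (3) can be compared with a direct enumeration of the box-filling problem. This is an illustrative Python sketch only; the values $n = 5$, $m = 3$ and the variable names are assumptions made for the example.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, m = 5, 3
boxes = range(n)

# Eq. (12-2): choose which m of the n boxes are occupied (objects alike).
occupied_sets = list(combinations(boxes, m))

# Eq. (12-3): objects distinguishable, so the order of assignment matters.
ordered_assignments = list(permutations(boxes, m))

count_2 = len(occupied_sets)        # n! / [m! (n - m)!]
count_3 = len(ordered_assignments)  # n! / (n - m)!
```

Each occupied set corresponds to exactly $m!$ ordered assignments, which is the content of theorem (d).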


e. Let us now determine the number of ways in which $m$ indistinguishable 
particles can be put into $n$ boxes. Suppose that the $m$ particles were 
placed along a line in any manner whatever and that $(n - 1)$ partitions 
were used to separate the particles. If one more partition were then placed 
at the end of the line, the particles could be regarded as having been placed 
into $n$ boxes. If, therefore, we consider the $m$ particles and the $(n - 1)$ 
partitions, visualized as walls, as a set of $(m + n - 1)$ objects, our problem 
becomes tantamount to finding the number of ways in which $(n - 1)$ walls 
can be arranged among the totality of $(m + n - 1)$ objects. This number, 
from sec. c and eq. (2) is 

$$\binom{n + m - 1}{n - 1} = \binom{n + m - 1}{m} \tag{12-4}$$


The preceding result is obtainable in several other ways, among which 
the following is sometimes given. Suppose that there are $n$ boxes and $m$ 
objects, as before. The first box can be selected in $n$ ways, leaving 
$(n + m - 1)$ boxes and objects which can be arranged in $(n + m - 1)!$ 
ways or a total number of $n(n + m - 1)!$ arrangements. However, 
permutations of boxes or particles among themselves do not correspond to 
recognizably different arrangements. Since this last number is $n!\,m!$, the 
desired number is again given by eq. (4). 

In the mathematical literature, the result of eq. (4) is sometimes known 
as the number of “combinations with repetitions.” We note that it equals 
the number of combinations of $(n + m - 1)$ things taken $m$ at a time, 
where repetitions are not allowed. 

A recursion formula for the case with repetition is sometimes useful. 
If there are three objects, taken two at a time, it is found that there are 
six possibilities: (aa, ab, ac, bb, bc, cc); if taken three at a time, there are 
ten cases: (aaa, aab, aac, abb, abc, acc, bbb, bbc, bcc, ccc). By mathematical 
induction, it is easy to show that for $n$ objects taken $m$ at a time 

$$C_m(n) = \frac{n + m - 1}{m}\,C_{m-1}(n) \tag{12-5}$$

If $m$ is given the successive values $1, 2, 3, \cdots, k$ and the equations 
multiplied together, the result is the now familiar one of eq. (4). 

f. The number of ways in which $m$ distinguishable objects may be placed 
in $n$ boxes is clearly $n^m$, for the first object can be put into $n$ places; with 
each of these dispositions of the first object can be combined $n$ dispositions 
of the second object, and so on. 
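
The results of (e) and (f) lend themselves to the same kind of numerical check. The sketch below is illustrative Python, not taken from the text; it assumes the starting value $C_0(n) = 1$ for the recursion (5), and the function name `with_repetition` is hypothetical.

```python
from itertools import combinations_with_replacement, product
from math import comb

def with_repetition(n, m):
    """Eq. (12-4): ways to put m alike particles into n boxes."""
    return comb(n + m - 1, m)

n = 3
# Enumerate the multisets of size m drawn from n labels (boxes).
enumerated = {m: len(list(combinations_with_replacement(range(n), m)))
              for m in range(1, 6)}

# Recursion (12-5): C_m(n) = (n + m - 1)/m * C_{m-1}(n), assuming C_0(n) = 1.
recursed = {}
c = 1
for m in range(1, 6):
    c = c * (n + m - 1) // m
    recursed[m] = c

# (f) m = 4 distinguishable objects in n = 3 boxes: n**m assignments.
distinguishable = len(list(product(range(n), repeat=4)))
```

For three objects the enumeration reproduces the six pairs and ten triples listed above.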


12.2. Binomial Coefficients.—The coefficients $\binom{n}{m}$ appear in Newton’s 
famous binomial expansion 

$$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i} \tag{12-6}$$

where $n$ is an integer. Its proof is fairly obvious, since the number of ways 
in which $i$ factors $a$ and $(n - i)$ factors $b$ can be selected from $n$ factors 
$(a + b)$ is $\binom{n}{i}$, by virtue of (2). We note that 

$$\binom{n}{0} = \binom{n}{n} = 1, \qquad \binom{n}{1} = n,$$

also 

$$\binom{n}{i} = 0 \quad \text{if } i > n, \text{ this because } (n - i)! = \infty$$

Most of the relations to be studied here are valid for non-integral values of 
$n$ provided we define 

$$\binom{n}{i} = \frac{n(n - 1) \cdots (n - i + 1)}{1 \cdot 2 \cdots i}$$


An important series in binomial coefficients may be obtained as follows. 
In view of (6), $\binom{n+k}{r}$ is the coefficient of $a^r b^{n+k-r}$ in the expansion of 
$(a + b)^{n+k}$. But 

$$(a + b)^{n+k} = (a + b)^n (a + b)^k = \left[\sum_{i} \binom{n}{i} a^i b^{n-i}\right]\left[\sum_{s} \binom{k}{s} a^s b^{k-s}\right]$$

The coefficient of $a^r b^{n+k-r}$ in this double sum is obtained by putting 
$i + s = r$ and summing over $i$. Hence 

$$\binom{n+k}{r} = \sum_{i=0}^{r} \binom{n}{i}\binom{k}{r-i} \tag{12-7}$$

This is known as the addition theorem of the binomial coefficients. 
From it, numerous other relations can be derived. 
On putting $k = 1$, we have 

$$\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$$

If we observe that 

$$\binom{-1}{r} = (-1)^r$$

we may also put $k = -1$ in (7), obtaining 

$$\binom{n-1}{r} = \sum_{i=0}^{r} (-1)^{r-i}\binom{n}{i}$$
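
The addition theorem and its special cases can be verified in exact rational arithmetic. The following Python sketch is offered as an illustration under stated assumptions: `binom` implements the generalized definition $n(n-1)\cdots(n-i+1)/(1\cdot 2\cdots i)$ for integer $i \ge 0$, and both function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def binom(n, i):
    """Generalized coefficient n(n-1)...(n-i+1) / (1*2*...*i), integer i >= 0."""
    value = Fraction(1)
    for j in range(i):
        value *= Fraction(n - j, j + 1)
    return value

def addition_rhs(n, k, r):
    """Right-hand side of eq. (12-7): sum over i of C(n, i) * C(k, r - i)."""
    return sum(binom(n, i) * binom(k, r - i) for i in range(r + 1))

integral_case = addition_rhs(5, 4, 3)       # should equal C(9, 3)
negative_case = addition_rhs(6, -1, 3)      # k = -1, should equal C(5, 3)
fractional_case = addition_rhs(Fraction(1, 2), Fraction(3, 2), 2)  # C(2, 2)
```

Because the definition is a polynomial in $n$, the same routine covers negative and fractional upper indices without modification.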


If, in Newton’s formula, we let $a = b = 1$, we find 

$$\sum_{i=0}^{n} \binom{n}{i} = 2^n$$

but if $a = -b$, 

$$\sum_{i=0}^{n} (-1)^i \binom{n}{i} = 0$$


12.3. Elements of Probability Theory.—An aggregate of elements, such 
as a set of observations, a sequence of results of some operation (e.g., 
throwing a die), is called a probability aggregate if it is permissible to apply 
the rules of the probability calculus to the aggregate. Whether or not this 
application is proper is usually decided on the basis of intuition: it seems 
clear that the decimal expansion of the fraction 1/7 does not form an 
aggregate of digits to which probability considerations may validly be applied; 
on the other hand, no hesitation is felt in subjecting the outcome of a series 
of throws of a die to probability reasoning. In the former sequence 
(142857142857, etc.) the digits occur with too much regularity to be 
regarded as “distributed at random.” The criteria for randomness, 
which decide whether an aggregate is a probability aggregate, may be 
stated with considerable precision² but will be omitted here. 

Every element is regarded as having one of a number, $s$, of 
distinguishable properties. (Each throw of a die is an element, the number appearing 
uppermost is a property; $s = 6$. In measuring a physical quantity, each 
measurement is an element, each measured value a property; $s$ may be 
infinite in this example.) If $n_i$ is the number of times the $i$-th property 
occurs and $n$ the total number of elements, 

$$\frac{n_i}{n}$$

is defined as the relative frequency of the $i$-th property. By the probability 
of the $i$-th property is meant the limit 

$$\lim_{n \to \infty} \frac{n_i}{n} = w_i \tag{12-8}$$


2 See Lindsay, R. B., and Margenau, H., “Foundations of Physics,” John Wiley 
and Sons, 1936. 

The existence of this limit is a matter which has given rise to considerable 
discussion; it will here be assumed.³ The totality of the $w_i$ is called the 
distribution of the probability aggregate. Obviously, $\sum_i w_i = 1$. 

The properties may be discrete (throwing dice) or continuous (value of a 
physical quantity, such as position of a particle). In the former case the 
distribution is sometimes said to be arithmetical, in the latter case, 
geometrical. In the continuous case a different formulation of probability is more 
convenient. Let $x$ denote the continuous property. The probability 
$w_x$, defined by (8) is clearly zero, but the probability that $x$ shall lie between 
$x$ and $x + \Delta x$ is finite and is, moreover, usually proportional to the range 
$\Delta x$ provided this range is sufficiently small. Hence we may write for this 
probability 

$$w(x)\,\Delta x$$

and the function $w(x)$, which does not have the physical dimension of a 
probability (a pure number) is called the probability density. Clearly, 

$$\int w(x)\,dx = 1$$

if the integral is taken over the entire range of properties. 

When a distribution $w_i$ or $w(x)$ is given, certain expressions frequently 
occurring in statistical theories can be calculated. We present the most 
important of these, using parallel formulations for the arithmetical and 
geometrical cases. To make this possible, we write $w(x_i)$ for the former 
$w_i$, thus letting $x_i$ represent the $i$-th property. 

If $f(x)$ is a function defined for every $x_i$ (or $x$) which has a 
non-vanishing probability, the mean of $f(x)$ with respect to the distribution $w(x)$ is 
given by 

$$\bar{f} = \begin{cases} \sum_i f(x_i)\,w(x_i) \\ \int f(x)\,w(x)\,dx \end{cases} \tag{12-9}$$

The dispersion of $f(x)$ with respect to $w(x)$ is defined by 

$$D(f) = \begin{cases} \sum_i [f(x_i) - \bar{f}]^2\,w(x_i) \\ \int [f(x) - \bar{f}]^2\,w(x)\,dx \end{cases} \tag{12-10}$$

On taking for the function $f(x)$ the variable $x$ itself there results 

$$\bar{x} = \begin{cases} \sum_i x_i\,w(x_i) \\ \int x\,w(x)\,dx \end{cases} \tag{12-11}$$

$$D(x) \equiv \sigma^2 = \begin{cases} \sum_i (x_i - \bar{x})^2\,w(x_i) \\ \int (x - \bar{x})^2\,w(x)\,dx \end{cases} \tag{12-12}$$

3 For further remarks see Lindsay and Margenau, loc. cit., Chapter 4. 

The quantity $\sigma^2$ is called the dispersion of the distribution $w(x)$, $\sigma$ is 
known as the standard deviation. As is clear from its definition, $\sigma$ is a 
measure of the spread of $w(x)$ about its mean. If $w(x)$ were regarded as a 
distribution of mass, $\sigma$ would represent its radius of gyration. By the 
$r$-th moment of the distribution is meant the quantity 

$$\overline{x^r} = \begin{cases} \sum_i x_i^r\,w(x_i) \\ \int x^r\,w(x)\,dx \end{cases}$$

For distributions with an infinite range of properties, higher moments do 
not always exist. The dispersion of $w(x)$ may be expressed in terms of its 
first and second moments. In view of (12), 

$$\sigma^2 = \overline{x^2} - 2\bar{x}^2 + \bar{x}^2 = \overline{x^2} - \bar{x}^2$$
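
As an arithmetical illustration of the mean, the dispersion and the moments, consider the throw of a die. The Python sketch below is not from the text; it assumes a fair die, so that every $w(x_i) = 1/6$, and the function names `mean` and `dispersion` are chosen for the example.

```python
from fractions import Fraction

def mean(f, xs, w):
    """Eq. (12-9), arithmetical case: sum of f(x_i) * w(x_i)."""
    return sum(f(x) * w[x] for x in xs)

def dispersion(f, xs, w):
    """Eq. (12-10): mean squared deviation of f from its own mean."""
    f_bar = mean(f, xs, w)
    return sum((f(x) - f_bar) ** 2 * w[x] for x in xs)

faces = range(1, 7)
w = {x: Fraction(1, 6) for x in faces}   # fair die, assumed for illustration

x_bar = mean(lambda x: x, faces, w)              # eq. (12-11)
sigma_sq = dispersion(lambda x: x, faces, w)     # eq. (12-12)
second_moment = mean(lambda x: x * x, faces, w)  # moment with r = 2
```

Exact fractions are used so that the relation between the dispersion and the first two moments holds identically rather than to rounding error.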


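Under certain conditions it is possible to expand a geometrical 
distribution in terms of its moments, provided these exist. For simplicity we shall 
take these moments about $\bar{x}$ as origin, so that $\bar{x} = 0$, $\overline{x^2} = \sigma^2$, etc. One 
can then prove⁴ that 

$$w(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}\left[1 + \sum_{i=3}^{\infty} \frac{c_i}{i!}\,H_i\!\left(\frac{x}{\sigma}\right)\right]$$

where $H_i$ is the $i$-th Hermite polynomial, and 

$$c_3 = \frac{\overline{x^3}}{\sigma^3}, \quad c_4 = \frac{\overline{x^4}}{\sigma^4} - 3, \quad c_5 = \frac{\overline{x^5}}{\sigma^5} - 10\,\frac{\overline{x^3}}{\sigma^3}, \quad c_6 = \frac{\overline{x^6}}{\sigma^6} - 15\,\frac{\overline{x^4}}{\sigma^4} + 30$$

This expansion is particularly useful when $w(x)$ does not depart too 
greatly from a normal “Gauss” distribution: $w(x) = e^{-x^2/2\sigma^2}/\sigma\sqrt{2\pi}$. 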


Problems. Two geometrical distributions of considerable interest in physics and 
chemistry are 

$$w_1(x) = \frac{h}{\sqrt{\pi}}\,e^{-h^2(x - a)^2}$$

$$w_2(x) = \frac{a}{\pi}\,\frac{1}{a^2 + x^2}$$

a. Show that, for $w_1$, $\bar{x} = a$ and $\sigma^2 = 1/2h^2$. 

b. Show that the $r$-th moment of $w_1$ is $1 \cdot 3 \cdot 5 \cdots (r - 1)/2^{r/2}h^r$ if $r$ is even; for 
odd $r$ it is zero. (Take $a = 0$.) All moments of $w_1$ are finite. 

c. Show that, for $w_2$, all odd moments are zero and no even moment (except the 
zero-th) exists. 

4 See Zernike, F., “Handbuch der Physik,” Vol. III, J. Springer, 1928, p. 448. 
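
The moment formula of Problem b can be tested numerically before a proof is attempted. The following Python sketch is an illustration only: it integrates by the midpoint rule over a finite interval, and the value $h = 1.5$, the interval half-width and the step count are arbitrary assumptions.

```python
import math

def w1(x, h, a=0.0):
    """Gaussian distribution of the problem: (h / sqrt(pi)) * exp(-h^2 (x - a)^2)."""
    return h / math.sqrt(math.pi) * math.exp(-(h * (x - a)) ** 2)

def moment(r, h, half_width=12.0, steps=20_000):
    """r-th moment of w1 about a = 0, midpoint rule on [-L, L] with L = half_width / h."""
    L = half_width / h
    dx = 2 * L / steps
    total = 0.0
    for k in range(steps):
        x = -L + (k + 0.5) * dx
        total += x ** r * w1(x, h)
    return total * dx

def moment_formula(r, h):
    """1*3*5*...*(r-1) / (2^(r/2) h^r) for even r, zero for odd r."""
    if r % 2:
        return 0.0
    double_factorial = math.prod(range(1, r, 2))
    return double_factorial / (2 ** (r / 2) * h ** r)

differences = [abs(moment(r, 1.5) - moment_formula(r, 1.5)) for r in range(7)]
```

A similar truncated integration applied to $w_2$ would not settle down as the interval is widened, which is the numerical symptom of the non-existence asserted in Problem c.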


12.4. Special Distributions.—A problem which is basic in statistical 
mechanics and in the theory of errors will here be discussed in some detail. 
It is of considerable historical interest, its solution being connected with the 
names of Newton, Bernoulli, Laplace, Poisson and Gauss. Consider $n$ 
boxes, each containing $P$ black balls and $Q$ white balls. We wish to find 
the probability $w_n(m)$, that in drawing one ball from each of the $n$ boxes, 
$m$ of them will be white. 

The probability of drawing a black ball from a given box is clearly 
$P/(P + Q) = p$, that of drawing a white ball is $Q/(P + Q) = q$. Thus 
$w_1(0) = p$, $w_1(1) = q$. If $n = 2$, the probability aggregate has the 
following properties: bb, bw, wb, ww (b = black, w = white), and these 
occur with the probabilities $p^2$, $pq$, $pq$, $q^2$; hence $w_2(0) = p^2$, $w_2(1) = 2pq$, 
$w_2(2) = q^2$. In general, the probability that $m$ white balls will be 
drawn from $m$ specified boxes and $n - m$ black ones from the remaining 
boxes will be 

$$q^m p^{n-m}$$

But in view of eq. (2) there are $\binom{n}{m}$ ways of selecting $m$ boxes from a 
total number of $n$ boxes. Hence the answer to the problem, first found by 
Newton, is 

$$w_n(m) = \binom{n}{m} q^m p^{n-m} \tag{12-13}$$

It is clear from (6) that 

$$\sum_{m=0}^{n} w_n(m) = 1$$

since $q + p = 1$. Eq. (13) has of course a more general significance than 
the one here particularized: it represents the probability of $m$ successes in 
$n$ independent trials if the probability of success in a single trial is $q$. 

To calculate the mean of $m$ and the dispersion of the arithmetical 
distribution $w_n(m)$ we consider the identity 

$$(p + qy)^n = \sum_{m=0}^{n} w_n(m)\,y^m$$

where $y$ is a variable. On differentiation with respect to $y$ this reads 

$$n(p + qy)^{n-1} q = \sum_{m} m\,w_n(m)\,y^{m-1} \tag{12-14}$$

When we let $y = 1$ in this equation, the right hand side becomes $\bar{m}$, so that 

$$\bar{m} = nq \tag{12-15}$$

The mean number of successes is equal to the probability of success in a 
single trial, multiplied by the number of trials. 

To find the dispersion, we differentiate (14) once more, and then set 
$y = 1$. The result is: 

$$n(n - 1)q^2 = \sum_{m} m(m - 1)\,w_n(m) = \overline{m^2} - \bar{m}$$

To obtain the dispersion we must add to the right hand side the quantity 
$\bar{m} - \bar{m}^2$ which, according to (15), equals $nq - n^2q^2$. Hence 

$$\sigma^2 = \overline{m^2} - \bar{m}^2 = nq(1 - q) = nqp \tag{12-16}$$
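
Eqs. (13), (15) and (16) may be confirmed by direct summation. This is a minimal Python sketch, added for illustration; the values $n = 40$ and $q = 0.3$ are arbitrary and the function names are hypothetical.

```python
from math import comb

def w(n, m, q):
    """Eq. (12-13): probability of m successes in n independent trials."""
    p = 1.0 - q
    return comb(n, m) * q ** m * p ** (n - m)

def summary(n, q):
    """Return (total probability, mean of m, dispersion of m) by direct summation."""
    probs = [w(n, m, q) for m in range(n + 1)]
    total = sum(probs)
    m_bar = sum(m * pm for m, pm in enumerate(probs))
    m_sq_bar = sum(m * m * pm for m, pm in enumerate(probs))
    return total, m_bar, m_sq_bar - m_bar ** 2

total, m_bar, sigma_sq = summary(n=40, q=0.3)
```

The three returned numbers should agree, to rounding error, with $1$, $nq$ and $nqp$ respectively.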


Especially interesting is the case where $q \ll p$, so that $p \approx 1$. For then 
the dispersion is numerically equal to the mean number of successes, a 
criterion which can sometimes be used to determine whether the successes are 
due entirely to chance. For applications of the formulas here developed, 
particularly to the case of radioactive emission, the reader is referred to 
Lindsay’s Physical Statistics. (See also the problem of the random walk 
at the end of this section.) 

For large values of $n$ and $m$ expression (13) is difficult to use because of 
the inconvenience in dealing with factorials of large numbers. We shall 
now prove that in this case $w_n(m)$ can be approximated by the Gauss error 
law. Let us first see what happens to $w_n(m)$ as $n \to \infty$. It is clear from 
(15) and (16) that both $\bar{m}$ and $\sigma^2$ tend to infinity, that is to say, if we were 
to plot $w_n(m)$ against $m$, the mean (which for sufficiently large $m$ is also 
the maximum of $w_n(m)$) would move outward from the origin and the 
distribution would broaden out indefinitely. However, the quantity $x$, 
defined as the deviation from the mean and measured on a proper scale 
which contracts as $n$ increases, namely 

$$x = \frac{m - \bar{m}}{\sqrt{n}} \tag{12-17}$$

will remain finite. We shall try to convert $w_n(m)$ into $w(x)$, assuming that 
$n \to \infty$. 
First compute 

$$\ln w_n(m) = \ln n! - \ln m! - \ln (n - m)! + (n - m)\ln p + m \ln q$$


Now by Stirling’s formula, which is valid for large numbers, 

$$\ln n! = \left(n + \tfrac{1}{2}\right)\ln n - n + \tfrac{1}{2}\ln 2\pi + (\text{terms of order } n^{-1})$$

Hence⁵ 

$$-\lim_{n \to \infty} \ln w_n(m) = \frac{1}{2}\ln\frac{2\pi m(n - m)}{n} + m\ln\frac{m}{nq} + (n - m)\ln\frac{n - m}{np} \tag{12-18}$$

In view of (17) and (15) 

$$m = nq\left(1 + \frac{x}{q\sqrt{n}}\right), \qquad n - m = np\left(1 - \frac{x}{p\sqrt{n}}\right)$$

When these expressions are introduced in (18) there results 

$$-\ln w(x) = \frac{1}{2}\ln 2\pi npq + \frac{1}{2}\frac{x^2}{q} + \frac{1}{2}\frac{x^2}{p}$$

provided we use the expansion of the logarithm 

$$\ln (1 + x) = x - \frac{x^2}{2} + \cdots$$

and retain no terms in negative powers of $n$. Thus we have, since 
$p + q = 1$, 

$$-\ln w(x) = \frac{1}{2}\ln 2\pi npq + \frac{x^2}{2pq}$$

whence 

$$w(x) = \frac{1}{\sqrt{2\pi npq}}\,e^{-x^2/2pq} \tag{12-19}$$

When written again in terms of $m$ it is 

$$\lim_{n \to \infty} w_n(m) = \frac{1}{\sqrt{2\pi npq}}\,e^{-(m - nq)^2/2npq} = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(m - \bar{m})^2/2\sigma^2} \tag{12-20}$$
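
The quality of the approximation (20) can be judged by comparing it with the exact law (13) for increasing $n$. The Python sketch below is illustrative; $q = 0.4$ and the three values of $n$ are arbitrary choices, and the function names are not from the text.

```python
from math import comb, exp, pi, sqrt

def w_exact(n, m, q):
    """Binomial law, eq. (12-13)."""
    return comb(n, m) * q ** m * (1.0 - q) ** (n - m)

def w_gauss(n, m, q):
    """Limiting form, eq. (12-20), with mean nq and dispersion npq."""
    sigma_sq = n * q * (1.0 - q)
    return exp(-(m - n * q) ** 2 / (2.0 * sigma_sq)) / sqrt(2.0 * pi * sigma_sq)

def max_abs_error(n, q):
    """Largest absolute difference between the two laws over m = 0..n."""
    return max(abs(w_exact(n, m, q) - w_gauss(n, m, q)) for m in range(n + 1))

errors = {n: max_abs_error(n, 0.4) for n in (10, 100, 1000)}
```

The largest discrepancy shrinks as $n$ grows, in keeping with the neglect of terms in negative powers of $n$.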


These results have a special significance with respect to errors of 
measurement, as can be seen from the following (oversimplified) argument. 
Suppose that the true value of a measured quantity is $A$, but that there 
are $n$ causes of error, each of which will add to $A$ the amount $\Delta A$ or $-\Delta A$ 
with equal probability. If $m$ of these $n$ causes contribute $+\Delta A$ then the 
resulting error is $r\Delta A = [m - (n - m)]\Delta A$, and therefore the probability 
of this error is $w_n(m)$ with $m = (n + r)/2$. For large $n$ the distribution 
of errors is then given by (20):⁶ 

$$w(r) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-r^2/8\sigma^2}$$

If $p = q = \frac{1}{2}$, $\bar{m} = n/2$ and $\bar{r} = 0$. If we denote $r\Delta A$ by $\epsilon_r$, Gauss’ error 
law 

$$w(\epsilon_r) = \text{const.}\; e^{-h^2\epsilon_r^2}$$

immediately results; the constant $h$, which depends on $\Delta A$ and is always 
determined empirically, is often called the “measure or index of precision.” 

5 In arriving at this result, it is convenient to add to the literal expansion of the 
logarithm by Stirling’s formula the quantity $(m \ln n - m \ln n)$. 
6 Note however, that $\sigma^2$ is no longer the dispersion with respect to the $r$-distribution. 
Furthermore, $\int w(r)\,dr \ne 1$. 

In the analysis leading to (19) quantities of the order $1/nq$ and $1/npq$ 
were neglected, the assumption being that $p$ and $q$ are numbers not greatly 
different from unity. Under these conditions the mean of $m$, $nq$, is a large 
number. This, then, is a criterion for the applicability of eq. (19). 

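It may happen, however, that $q$ is small, so small indeed that $nq$ is of 
order unity in a given application. In this case the distribution (13) has, 
to be sure, spread out indefinitely ($n \to \infty$) but the mean has remained 
small; the resulting distribution is quite asymmetrical. To deal with this 
situation we put 

$$q = \frac{a}{n}$$

and treat $a = \bar{m}$ as a number of order unity while $n$ tends to infinity. Thus 

$$w_n(m) = \frac{n(n - 1) \cdots (n - m + 1)}{m!}\left(\frac{a}{n}\right)^m\left(1 - \frac{a}{n}\right)^{n-m} = \frac{a^m}{m!}\left(1 - \frac{a}{n}\right)^n \frac{\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{m - 1}{n}\right)}{\left(1 - \frac{a}{n}\right)^m}$$

As $n \to \infty$, the last fraction takes on the value 1. Hence, under these 
conditions, 

$$\lim_{n \to \infty} w_n(m) = \frac{a^m e^{-a}}{m!} \tag{12-21}$$

since 

$$\lim_{n \to \infty}\left(1 - \frac{a}{n}\right)^n = e^{-a}$$

Formula (21) was first derived by Poisson and bears his name. It is used 
in the theory of radioactivity. 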
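
The passage to the limit (21) can be followed numerically by holding the mean $a = nq$ fixed while $n$ increases. The following Python sketch is an illustration under assumed values ($a = 1.5$, $m$ from 0 to 10); the function names are hypothetical.

```python
from math import comb, exp, factorial

def w_binomial(n, m, a):
    """Eq. (12-13) with q = a/n, so that the mean nq stays equal to a."""
    q = a / n
    return comb(n, m) * q ** m * (1.0 - q) ** (n - m)

def w_poisson(m, a):
    """Eq. (12-21): a^m e^(-a) / m!."""
    return a ** m * exp(-a) / factorial(m)

a = 1.5
gaps = {n: max(abs(w_binomial(n, m, a) - w_poisson(m, a)) for m in range(11))
        for n in (20, 200, 2000)}
```

The gap between the two laws falls off roughly in proportion to $1/n$, so Poisson's formula is already serviceable for moderately large $n$ when $q$ is small.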


Problem a. Plot $w_n(m)$ for $q = \frac{1}{4}$; $n = 5, 10, 50$. Observe the change from an 
asymmetrical to a symmetrical distribution. Compare Poisson’s formula with the 
plot for $n = 5$. 

Problem b. “Random walk.” A person, making steps of length $l$, is just as likely to 
step forwards as backwards ($p = q = \frac{1}{2}$). Prove that, after taking $n$ steps, he will 
have gone forward a distance $rl$ with a probability 

$$\left(\frac{1}{2}\right)^n \binom{n}{\frac{1}{2}(n + r)}$$

Show also that $\bar{r} = 0$, $\overline{r^2} = n$. 


12.5. Gibbsian Ensembles.—It is the main purpose of statistical 
mechanics to provide a formalism by means of which the facts of 
thermodynamics (cf. Chapter 1) can be deduced. This may be done in several 
different ways, that is, on the basis of several distinct sets of fundamental 
axioms. Two of these stand out for their success and clarity. One, the 
system of Gibbs, is particularly suited to a development of the classical 
laws of thermodynamics, i.e., those relations whose understanding is 
possible without the use of quantum mechanics. Gibbs’ statistical 
mechanics will be summarized in this section and the next. The remainder of the 
present chapter will be devoted to the method of Darwin and Fowler with 
the aid of which the subject of quantum statistics is most satisfactorily 
discussed. 

The central concept of Gibbs’⁷ theory is the ensemble, the meaning of 
which will now be discussed. Statistical mechanics deals with certain 
properties of physical objects, as for instance a given body of gas, or liquid, 
or a solid. Such an object will be called a system, or, more specifically, a 
thermodynamic system. If it has $n$ degrees of freedom, then its complete 
mechanical state can be specified in terms of $n$ generalized coordinates and 
$n$ generalized momenta, a total of $2n$ numbers. Mathematically, these $2n$ 
numbers may be said to define a point in a space of $2n$ dimensions, and this 
space is called the phase space of the system. At any instant of time, the 
system is represented by one point in its phase space, and in the course of 
time, this point will move, describing a certain trajectory in phase space. 
When the position of the representative point at any instant is known, it is 
theoretically possible by the laws of mechanics to calculate its position at 
any other time, but such a prediction is practically not feasible. Other, 
less detailed methods of description must be chosen. 

7 There is no more lucid and careful exposition of J. W. Gibbs’ ideas than his own, 
“Elementary Principles of Statistical Mechanics”; C. Scribner’s Sons, 1902; 
Collected Works, vol. II, Longmans, Green & Co., 1928, New York. See also 
“Commentary on the Scientific Writings of J. Willard Gibbs,” Yale University Press, 
1936. 

In the simplest instance of a thermodynamic system, the ideal gas 
consisting of $\nu$ molecules, $n = 3\nu$, and the phase space has $6\nu$ dimensions. A 
representative point would correspond to an exact assignment of 3 
components of momentum and position to each of the $\nu$ molecules, and the 
path of the point would portray the changes which the values of all these 
quantities undergo in time. In this case, another picture is often useful. 
One may regard the phase space of the system as being composed of $\nu$ 
subspaces, one for each molecule. Such a subspace is called a $\mu$-space 
(“molecule space”) in order to distinguish it from the entire phase space 
which is often designated as $\gamma$-space (“gas space”). In the case of 
molecules regarded as mass points, $\mu$-space has 6 dimensions, although in general 
for molecules having internal degrees of freedom the number of dimensions 
is greater. Use of the $\mu$-space is often very convenient, but it loses its 
significance except as an approximate description when strong interaction 
exists between the molecules. 

We shall denote the $n$ generalized coordinates of our system by 
$q_1 q_2 \cdots q_n$, the generalized momenta by $p_1 \cdots p_n$. Out of these we 
construct an element $d\phi$ of phase space in which the $p$’s and $q$’s are taken as 
Cartesian coordinates: 

$$d\phi = dq_1\,dq_2 \cdots dq_n\,dp_1\,dp_2 \cdots dp_n$$

It is possible to show⁸ that any point transformation 

$$q_i' = q_i'(q_1 \cdots q_n)$$

$$p_i' = p_i'(q_1 \cdots q_n, p_1 \cdots p_n)$$

leaves $d\phi$ invariant; thus 

$$d\phi = dq_1' \cdots dq_n'\,dp_1' \cdots dp_n'$$

Since the system is assumed to obey mechanical laws, Hamilton’s canonical 
equations must be valid (cf. 9-13): 

$$\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad i = 1, 2, \cdots n \tag{12-22}$$


From these equations it follows at once that through every point in phase 
space there passes but one trajectory; for when every $p$ and $q$ is given, 
equations (22) determine uniquely the rate of change of every coordinate in 
phase space. Hence the representative point can never cross its previous 
path. Whether the motion of the point will ultimately carry it through all 
regions of phase space has not been proved completely; such behavior is 
tentatively asserted by the so-called ergodic hypothesis⁹ which, however, is 
not needed in Gibbs’ formulation of statistical theory. 

It would seem that the values of thermodynamic quantities such as 
temperature, pressure, etc., could be regarded as time averages over the 
motion of the representative point of the system in its own phase space. 
The development of this conjecture, however, is fraught with rather 
formidable difficulties and is not usually attempted. Instead, Gibbs 
introduces what he calls an ensemble of systems, by which is meant a very 
large set of imagined replicas of the one real system under consideration. 
These systems are not in identical states, but the state of each is 
represented by a phase point in its own phase space. Since all imaginary 
systems of the ensemble are similar as to number of molecular constituents 
and Hamiltonian function, all points can be plotted in the same phase space, 
in which they will be distributed with a certain density, $D$. 

8 See Gibbs, loc. cit. or Lindsay, loc. cit. 
9 See Tolman, loc. cit. 

This density will in general be different in different parts of phase space, 
and it will change in time. Hence 

$$D = D(p_1 \cdots p_n; q_1 \cdots q_n; t)$$

Nothing has as yet been said about the initial distribution of points in 
phase space which, in view of the meaning of the ensemble, is quite 
arbitrary. Whatever the functional form of $D$, we must require 

$$\int D\,d\phi = N \quad \text{for every } t$$

if $N$ is the (very large) number of systems in the ensemble. It is 
convenient also to introduce a “probability of phase” 

$$P = \frac{D}{N} \tag{12-23}$$

such that 

$$\int P\,d\phi = 1$$


12.6. Ensembles and Thermodynamics.—By virtue of Liouville’s 
theorem, proved in almost all books on statistical mechanics (also known 
as the principle of conservation of density-in-phase, a name due to Gibbs), 
the representative points move in phase space as though they constituted 
an incompressible fluid of varying density. A group of points filling a 
certain region of phase space at a time $t_0$ can neither contract nor expand 
during its motion; it will continue to occupy the same volume but with 
altered shape. Mathematically these statements are expressed as follows: 

$$\frac{\partial D}{\partial t} + \sum_{i=1}^{n} \frac{\partial D}{\partial p_i}\,\dot{p}_i + \sum_{i=1}^{n} \frac{\partial D}{\partial q_i}\,\dot{q}_i = 0 \tag{12-24}$$

Thus it is seen that phase space possesses no intrinsic property of 
accumulating phase points in some regions or not admitting them to others; 
Liouville’s theorem shows phase space to be indifferent to the motion of the 
points. This fact suggests the following fundamental postulate, by means 
of which contact is established between an ensemble and thermodynamic 
experience: 

The probability that, at any instant $t$, a given real system be found in 
the state characterized by $q_1 \cdots q_n$, $p_1 \cdots p_n$ is the same as the probability 
$P(p_1 \cdots p_n; q_1 \cdots q_n; t)$ that a system selected at random from the 
corresponding ensemble shall have the phase $q_1 \cdots q_n$, $p_1 \cdots p_n$ at the instant $t$. 
The probability that the values of the $p$’s and $q$’s shall lie within a small 
extension of phase $\Delta\phi$ is proportional to $\Delta\phi$. We are thus attributing equal 
intrinsic probabilities to equal volumes of phase space, a procedure 
suggested, though not made necessary, by Liouville’s principle. 
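
The conservation of extension in phase asserted by Liouville's theorem can be seen in the simplest case of one degree of freedom. The Python sketch below is an illustration only, assuming a harmonic oscillator with hypothetical mass and frequency values; it follows three neighboring phase points under the exact motion and compares the area of the triangle they span.

```python
import math

def flow(q0, p0, t, m=1.0, omega=2.0):
    """Exact solution of Hamilton's equations for H = p^2/2m + m omega^2 q^2/2."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    q = q0 * c + p0 / (m * omega) * s
    p = p0 * c - m * omega * q0 * s
    return q, p

def triangle_area(points):
    """Area of a triangle of phase points (q, p) by the shoelace formula."""
    (q1, p1), (q2, p2), (q3, p3) = points
    return 0.5 * abs((q2 - q1) * (p3 - p1) - (q3 - q1) * (p2 - p1))

initial = [(1.0, 0.0), (1.2, 0.0), (1.0, 0.3)]
area_0 = triangle_area(initial)
areas = [triangle_area([flow(q, p, t) for q, p in initial])
         for t in (0.1, 0.7, 2.5)]
```

The triangle is sheared and rotated as time goes on, but its area stays the same, which is the two-dimensional form of the statement that a group of points keeps its volume with altered shape.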

In accordance with this postulate we may calculate mean values of 
dynamical quantities of the real system by computing mean values over 
the individuals composing the ensemble. If $R$ is such a quantity, 
expressible as a function of momenta and coordinates, then 

$$\bar{R}(t) = \int R(p_1 \cdots p_n; q_1 \cdots q_n)\,P(p_1 \cdots p_n; q_1 \cdots q_n; t)\,d\phi \tag{12-25}$$

And by $\bar{R}(t)$ is meant in general the expected mean value of the quantity $R$ 
which would be obtained when $R$ is actually measured at the time $t$. It 
can be shown that deviations from this expected mean are extremely small 
when the system in question has many degrees of freedom, so that the 
expected mean may be identified for practical purposes with the value of $R$ 
actually measured in a single observation. Moreover, we shall see at once 
that under equilibrium conditions $P$ is not a function of $t$, so that $\bar{R}$, also, 
will not be a function of $t$. One may then think of $\bar{R}$ as the mean value 
of the quantity $R$ in a temporal sense, i.e., $\bar{R} = \frac{1}{T}\int_0^T R\,dt$ for sufficiently 
large $T$, without violating the spirit of the postulate. 

If the thermodynamic system is in equilibrium, the number of 
representative points in any given extension in phase, $\Delta\phi$, must remain constant 
in time. The condition of equilibrium may therefore be stated in the form 

$$\frac{\partial D}{\partial t} = 0$$

For the equilibrium case, in which we are chiefly interested (a reversible 
thermodynamic change consists of a sequence of equilibrium states), 
Liouville’s theorem states 

$$\sum_{i=1}^{n}\left(\frac{\partial D}{\partial p_i}\,\dot{p}_i + \frac{\partial D}{\partial q_i}\,\dot{q}_i\right) = 0 \tag{12-26}$$
Let us now give thought to the initial form of the function 
$D(p_1 \cdots p_n; q_1 \cdots q_n)$. We know that if it satisfies eq. (26), then it 
implies $\dot{D}(t) = 0$ and corresponds to an equilibrium condition of the 
thermodynamic system. Hence $D$ will forever be independent of $t$. But we 
note that if we put $D = D(H)$, where $H$ is the Hamiltonian function of 
the system, $D$ will certainly satisfy eq. (26); for the left-hand side of that 
equation will read 

$$\frac{\partial D}{\partial H}\sum_{i=1}^{n}\left(\frac{\partial H}{\partial p_i}\,\dot{p}_i + \frac{\partial H}{\partial q_i}\,\dot{q}_i\right)$$

and this vanishes because of (22). Hence we take $D = D(H)$. 
Further restrictions on $D$ cannot be imposed on the basis of mechanical
or statistical reasoning, except for the obvious facts that $D$ must be
everywhere positive and must satisfy $\int D\,d\phi = N$. However, the choice of the
function must be such as to lead to the thermodynamic formulas when
thermodynamic quantities are computed by eq. (25). The important
choices by which this success can be achieved, as Gibbs has shown, are
these:

$$D(H) = \text{const. when } E_0 \le H \le E_0 + \Delta E, \qquad D(H) = 0 \text{ for all other values of } H \tag{12-27}$$

$$D(H) = C e^{-H/\theta} \tag{12-28}$$

where $C$ and $\theta$ are positive constants. The first is called the microcanonical
or energy shell ensemble, the second the canonical ensemble. The energy
shell ensemble seems most reasonable from the physical point of view, for a
system in equilibrium is one of fixed total energy, i.e., fixed within an
interval of error $\Delta E$, and systems not having an energy within this range
are excluded from consideration. However, the canonical ensemble,
although it assigns a finite density to points corresponding to those members
of the ensemble which do not satisfy the requirement of constant
energy, also leads to the correct thermodynamic relations. Since it is
mathematically easier to handle, it enjoys greater popularity than the
former, and was indeed preferred by Gibbs.

The connection between the two types of ensemble may be exhibited in
the following way. Consider a gas whose phase density in $\gamma$-space is
represented by a microcanonical ensemble. Let it consist of molecules
with $\mu$-spaces $\mu_1$, $\mu_2$, etc., with probability distribution $P_i$ in space $\mu_i$.
Denote the element of extension in $\mu_i$ by $d\phi_i$. Since energy exchanges may
take place between the molecules, $P_i$ cannot be represented by a
microcanonical distribution; it must indeed be finite for all energies $H_i$ of the
$i$-th molecule. Nevertheless, the probability that molecule 1 be within
the element $d\phi_1$ of its $\mu$-space, molecule 2 within $d\phi_2$ of its $\mu$-space, etc.,
simultaneously, equals the probability that the whole gas be in the element
$d\phi = d\phi_1\,d\phi_2 \cdots d\phi_\nu$ of $\gamma$-space. Hence

$$P_1(H_1)\,d\phi_1\,P_2(H_2)\,d\phi_2 \cdots P_\nu(H_\nu)\,d\phi_\nu = P(H)\,d\phi \tag{12-29}$$

so that

$$P_1(H_1) \cdot P_2(H_2) \cdots P_\nu(H_\nu) = P(H_1 + H_2 + \cdots + H_\nu)$$

We wish this functional equation to be satisfied for every value of the total
energy $H = \sum_i H_i$, although, of course, for any given $H$ the constant $P(H)$
may be described by the microcanonical distribution. The solution of
eq. (29) therefore leads to a very natural extension of this distribution.
Eq. (29) holds for every $\nu$. If the gas consisted of only 2 molecules,

$$P_1(H_1) \cdot P_2(H_2) = P(H_1 + H_2)$$


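Hence it follows that $P_i(0) = 1$ for every $i$. If we denote $\log P_i$ by $f_i$, we
have

$$f_1(H_1) + f_2(H_2) = f(H_1 + H_2) \tag{12-30}$$

On putting $H_2 = 0$, this reads

$$f_1(H_1) + f_2(0) = f(H_1)$$

and since $f_2(0) = 0$, $f_1 = f$. Thus all $f_i$ are seen to be the same function, $f$.
We are thus led to consider the equation

$$f(x) + f(y) = f(x + y)$$

When $y$ is taken equal to $x$, we have $2f(x) = f(2x)$, and so by induction,

$$f(nx) = nf(x)$$

for every integer $n$. From this relation,

$$f(x) = f\!\left(m \cdot \frac{x}{m}\right) = m f\!\left(\frac{x}{m}\right), \qquad f\!\left(\frac{n}{m}\,x\right) = \frac{n}{m}\,f(x)$$

where $m$ is another integer. Finally, for rational $x$ and, by continuity, for every $x$,

$$f(x) = f(x \cdot 1) = x f(1) = \text{const.} \cdot x$$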


We have shown that the only function which satisfies eq. (30) is $f_i = cH_i$,
whence

$$P_i(H_i) = e^{cH_i}$$

But $P_i$, being a probability, must remain finite for every $H_i$, a quantity
which may tend to $+\infty$, though not to $-\infty$. Hence $c$ is a negative constant.
Following Gibbs, we write for it $-1/\theta$, so that finally

$$P_i(H_i) = e^{-H_i/\theta} \tag{12-31}$$

This defines the canonical ensemble, in accordance with eq. (28). The
constant $C$ in (28) has been introduced to insure normalization in $\gamma$-space:
$\int P\,d\phi = 1$; the functions $P_i$ in (29) are not properly normalized in each
$\mu$-space, as is evident on closer inspection.


Problem. Consider as system a single particle of mass $m$ in a constant gravitational
field. Note that the microcanonical ensemble is given by Fig. 1, where all points not
lying between the two parabolas $A$ and $B$, corresponding to $H = E_0$ and $H =
E_0 + \Delta E$, have zero density. Show that the group of points lying between $p_1$ and
$p_2$ at $t = 0$ will lie between $p_1'$ and $p_2'$ at time $t$ such that $p_1' = p_1 + mgt$, $p_2' = p_2 + mgt$.
Prove also the invariance of the element of phase volume, i.e., area $\phi_1$ = area $\phi_2$.
(Liouville's theorem.)


Fig. 12-1
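The invariance asserted in the problem can be illustrated numerically. The following is only a sketch under stated assumptions: the Hamiltonian is taken as $H = p^2/2m - mgq$ (coordinate $q$ measured downward, so that $p' = p + mgt$ as in the problem), and the names `evolve` and `polygon_area` as well as the numerical values are chosen here for illustration. Because the motion is an affine map of the $(q, p)$ plane, a polygon of phase points stays a polygon and its area can be compared before and after.

```python
def evolve(point, t, m, g):
    """Move a phase point (q, p) forward by time t under H = p^2/(2m) - m*g*q."""
    q, p = point
    # Hamilton's equations give dp/dt = m*g and dq/dt = p/m.
    return (q + p * t / m + 0.5 * g * t * t, p + m * g * t)


def polygon_area(vertices):
    """Shoelace formula for the area enclosed by a polygon in the (q, p) plane."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        q1, p1 = vertices[k]
        q2, p2 = vertices[(k + 1) % n]
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0


# Illustrative values only: a unit square of phase points followed for 2 time units.
m, g, t = 1.5, 9.8, 2.0
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
moved = [evolve(pt, t, m, g) for pt in square]
area_before = polygon_area(square)
area_after = polygon_area(moved)
```

The square is sheared into a parallelogram, yet `area_before` and `area_after` agree to rounding error, which is the content of Liouville's theorem for this simple system.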


12.7. Further Considerations Regarding the Canonical Ensemble.— 
As an illustration regarding the use of the canonical ensemble we derive
the Maxwell law for the distribution of velocities in an ideal gas. In
accordance with eq. (28) we put

$$P\,d\phi = C e^{-H/\theta}\,dp_1\,dp_2 \cdots dp_n\,dq_1\,dq_2 \cdots dq_n = C e^{-H/\theta} \prod_{i=1}^{\nu} dp_{ix}\,dp_{iy}\,dp_{iz}\,dx_i\,dy_i\,dz_i$$


The constant $C$ must be so chosen as to make $\int P\,d\phi = 1$. For an ideal
gas,

$$H = \sum_{i=1}^{\nu}\left[\frac{p_{ix}^2 + p_{iy}^2 + p_{iz}^2}{2m} + V(x_i, y_i, z_i)\right] = \sum_{i=1}^{\nu} H_i$$

where $V(x,y,z)$ is the potential energy of a particle in an external field if
such a field is present. The probability that particle $i$ have an energy $H_i$
corresponding to $p_{ix}, p_{iy}, p_{iz}, x_i, y_i, z_i$, regardless of the states of all
other particles, is clearly given by the integral $\int P\,d\phi$ extended over the
momenta and positions of all particles except $i$:

$$P_i\,dp_{ix}\,dp_{iy}\,dp_{iz}\,dx_i\,dy_i\,dz_i = c' e^{-H_i/\theta}\,d\phi_i \tag{12-32}$$


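$c'$ being some other constant. This relation, often called the
Maxwell-Boltzmann law, is really nothing more than eq. (31). When no external
field $V$ is present, it may be written in more explicit form, for the constant
$c'$ can then be determined. Since $\int P_i\,d\phi_i = 1$, we have

$$\frac{1}{c'} = \int\!\cdots\!\int e^{-(p_x^2 + p_y^2 + p_z^2)/2m\theta}\,dp_x\,dp_y\,dp_z\,dx\,dy\,dz = \tau \iiint e^{-(p_x^2 + p_y^2 + p_z^2)/2m\theta}\,dp_x\,dp_y\,dp_z$$

where $\tau$ is the volume of the gas. Thus

$$\frac{1}{c'} = \tau\left[\int_{-\infty}^{\infty} e^{-p^2/2m\theta}\,dp\right]^3 = \tau\,(2m\theta)^{3/2}\,\pi^{3/2}$$

When (32) is now expressed in terms of velocities instead of momenta and
an integration over the volume is carried out on both sides, the result is

$$P(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z = (2\pi m\theta)^{-3/2}\,e^{-(m/2\theta)(v_x^2 + v_y^2 + v_z^2)}\,d(mv_x)\,d(mv_y)\,d(mv_z) = \left(\frac{m}{2\pi\theta}\right)^{3/2} e^{-(m/2\theta)(v_x^2 + v_y^2 + v_z^2)}\,dv_x\,dv_y\,dv_z \tag{12-33}$$

To find the meaning of the parameter $\theta$ we compute the mean energy of
the $i$-th particle, $\bar{H}_i$, which, as we know from simple statistical theory,
must be equal to $\tfrac{3}{2}kT$. We have

$$\bar{H}_i = (2\pi m\theta)^{-3/2} \iiint \frac{p_x^2 + p_y^2 + p_z^2}{2m}\,e^{-(p_x^2 + p_y^2 + p_z^2)/2m\theta}\,dp_x\,dp_y\,dp_z$$

$$= \theta\,\pi^{-3/2} \iiint (u^2 + v^2 + w^2)\,e^{-(u^2 + v^2 + w^2)}\,du\,dv\,dw = \tfrac{3}{2}\theta$$

where $u = p_x/(2m\theta)^{1/2}$, and similarly for $v$ and $w$.
If this is to be equal to $\tfrac{3}{2}kT$, we must put

$$\theta = kT \tag{12-34}$$

Making this substitution in (33) we obtain Maxwell's law for the
distribution of velocities

$$P(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z = \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-(m/2kT)(v_x^2 + v_y^2 + v_z^2)}\,dv_x\,dv_y\,dv_z \tag{12-35}$$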

The probability that the absolute value of $v$ shall lie between $v$ and $v + dv$
is derived from this expression by transforming the "volume" element
$dv_x\,dv_y\,dv_z$ to spherical coordinates, where it takes the form $v^2\,dv\,\sin\vartheta\,d\vartheta\,d\varphi$,
and then integrating over $\vartheta$ and $\varphi$. Thus

$$P(v)\,dv = 4\pi\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mv^2/2kT}\,v^2\,dv \tag{12-36}$$

According to its derivation $P(v)$ denotes the probability that one
molecule shall have a speed about $v$. It is then clear that $\nu P(v)$ represents
the number of molecules having this speed. It is this last interpretation
which is usually given to Maxwell's law.
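Eq. (36) lends itself to a quick numerical check. The sketch below is illustrative only: the function names, the midpoint-rule integration and the values of $m$ and $kT$ are choices made here, not part of the text. It estimates the normalization and the first two moments of the speed distribution.

```python
import math


def maxwell_speed_pdf(v, m, kT):
    """Eq. (12-36): probability density of the speed v for one molecule."""
    prefactor = 4.0 * math.pi * (m / (2.0 * math.pi * kT)) ** 1.5
    return prefactor * math.exp(-m * v * v / (2.0 * kT)) * v * v


def speed_moment(power, m, kT, n=20000):
    """Midpoint-rule estimate of the integral of v**power * P(v) dv over v >= 0."""
    v_max = 12.0 * math.sqrt(kT / m)  # the integrand is negligible beyond this speed
    h = v_max / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        total += v ** power * maxwell_speed_pdf(v, m, kT)
    return total * h


m, kT = 2.0, 0.5                     # arbitrary values in consistent units
norm = speed_moment(0, m, kT)        # normalization of P(v)
mean_v = speed_moment(1, m, kT)      # mean speed
mean_v2 = speed_moment(2, m, kT)     # mean square speed
```

With these values `norm` comes out equal to 1, `mean_v` to $2(2kT/\pi m)^{1/2}$ and `mean_v2` to $3kT/m$, all to within the integration error.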

For most purposes it is convenient to write the canonical distribution
law, eq. (28), in a slightly different form. When we put $C$, the positive
constant occurring in that equation, equal to $Ne^{\psi/\theta}$, where $\psi$ is a new
parameter depending on $\theta$, we have

$$P(H) = e^{(\psi - H)/\theta} \tag{12-37}$$

the standard form used by Gibbs.

In conclusion, let us attempt to correlate the quantities $H$, $\psi$, and $\theta$
with thermodynamic quantities. This can be done through the
thermodynamic relations, the most important of which are:

$$dU = T\,dS - \sum_i f_i\,d\xi_i \tag{12-38}$$

and

$$-dA = S\,dT + \sum_i f_i\,d\xi_i \tag{12-39}$$

Here $U$ stands for the internal energy of the system, $S$ for its entropy, $A$
for the Helmholtz free energy. The force $f_i$ is defined by $f_i = -\partial H/\partial \xi_i$; it
is called into play when the $i$-th external coordinate is changed. The
summation $\sum_i f_i\,d\xi_i$ represents, therefore, the total work done by the
thermodynamic system when it undergoes a (reversible) change involving
variation of the $\xi_i$.

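We now return to the ensemble whose distribution is given by (37).
This distribution will change in detail as the condition of the system
changes. But it will change in such a way that

$$\int e^{(\psi - H)/\theta}\,d\phi = 1$$

On differentiating this relation (permitting the external parameters $\xi$ as
well as $\theta$ and hence $\psi$ to be altered) we have

$$d\int e^{(\psi - H)/\theta}\,d\phi = \int e^{(\psi - H)/\theta}\left[\frac{d\psi}{\theta} - \frac{\psi - H}{\theta^2}\,d\theta - \frac{1}{\theta}\sum_i \frac{\partial H}{\partial \xi_i}\,d\xi_i\right] d\phi$$

$$= \frac{d\psi}{\theta} - \frac{\psi - \bar{H}}{\theta^2}\,d\theta + \frac{1}{\theta}\sum_i \bar{f}_i\,d\xi_i = 0 \tag{12-40}$$

provided we use eq. (37) and indicate averages over the ensemble by
horizontal bars, i.e., $\bar{Q} = \int PQ\,d\phi$. The last equation may be written

$$-d\psi = -\overline{\ln P}\,d\theta + \sum_i \bar{f}_i\,d\xi_i \tag{12-41}$$

But since $\overline{\ln P} = \dfrac{\psi - \bar{H}}{\theta}$, we also have

$$\psi = \theta\,\overline{\ln P} + \bar{H} \tag{12-42}$$

$$d\psi = d\bar{H} + \theta\,d(\overline{\ln P}) + \overline{\ln P}\,d\theta$$

When this is substituted in (41), the result is

$$d\bar{H} = -\theta\,d(\overline{\ln P}) - \sum_i \bar{f}_i\,d\xi_i \tag{12-43}$$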


Now it is clear that $d\bar{H}$ must be identified with the increase in total energy
of the thermodynamic system, $dU$. We have already established the
relation $\theta = kT$. Furthermore, the $\bar{f}_i$ can hardly be anything other than
the actual forces acting on the real system. We then see that (43) is the
exact analogue of the thermodynamic relation (38), provided we interpret
$-\overline{\ln P}$ as entropy divided by Boltzmann's constant, $k$.

When eq. (41) is now compared with (39), $\psi$ is at once seen to be the
Helmholtz free energy, $A$. With this additional interpretation, eq. (42)
becomes the familiar

$$A = U - TS$$

Further pursuit of this matter shows that all thermodynamic relations are
satisfied if we correlate thermodynamic with statistical quantities as
follows:

Total energy ($U$) corresponds to $\bar{H}$

Absolute temp. ($T$) corresponds to $\theta/k$ (12-44)

Entropy ($S$) corresponds to $-k\,\overline{\ln P}$

Helmholtz free energy ($A$) corresponds to $\psi$


When $A$ is given, many thermodynamic properties of the system at hand
are known (cf. Chapter 1). It is therefore important to know how to
compute the free energy, i.e., $\psi$, statistically. Since $\int e^{(\psi - H)/\theta}\,d\phi = 1$, we
have $e^{-\psi/\theta} = \int e^{-H/\theta}\,d\phi$. The integral $\int e^{-H/\theta}\,d\phi = I$, which is thus
seen to be basic in the evaluation of $\psi$, is often called the phase integral
(also "sum of state" and "partition function"). In terms of it

$$\psi = -kT \ln I$$
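As an illustration of the phase integral, the sketch below evaluates it on a grid for a one-dimensional harmonic oscillator, $H = p^2/2m + m\omega^2 q^2/2$. The oscillator, the grid parameters and all names are assumptions made for this example; units are chosen so that $k = 1$ and $\theta = kT$. It then checks eq. (41) at fixed external parameters, namely $d\psi/d\theta = \overline{\ln P}$, by a central difference.

```python
import math


def oscillator_ensemble(theta, m=1.0, omega=1.0, half_width=10.0, n=300):
    """Midpoint-grid estimates for H = p^2/(2m) + m*omega^2*q^2/2.

    Returns (phase_integral, psi, mean_H, mean_ln_P) for modulus theta.
    """
    h = 2.0 * half_width / n
    grid = [-half_width + (i + 0.5) * h for i in range(n)]
    phase_integral = 0.0
    weighted_H = 0.0
    for p in grid:
        for q in grid:
            H = p * p / (2.0 * m) + 0.5 * m * omega ** 2 * q * q
            w = math.exp(-H / theta) * h * h
            phase_integral += w
            weighted_H += H * w
    psi = -theta * math.log(phase_integral)   # psi = -theta ln I
    mean_H = weighted_H / phase_integral      # ensemble average of H
    mean_ln_P = (psi - mean_H) / theta        # because ln P = (psi - H)/theta
    return phase_integral, psi, mean_H, mean_ln_P


theta = 1.5
phase_int, psi, U, mean_ln_P = oscillator_ensemble(theta)

# Central difference for d(psi)/d(theta) with the external parameters held fixed.
d = 1e-3
dpsi_dtheta = (oscillator_ensemble(theta + d)[1]
               - oscillator_ensemble(theta - d)[1]) / (2.0 * d)
```

For this oscillator the exact phase integral is $2\pi\theta/\omega$ and $\bar{H} = \theta$; the grid values reproduce both, and `dpsi_dtheta` agrees with `mean_ln_P`, in line with the identification of $-k\,\overline{\ln P}$ with the entropy.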


Problem. Using (36), verify the following relations for the first and second
moments of the velocity distribution of an ideal gas:

$$\bar{v} = 2\sqrt{\frac{2kT}{\pi m}}, \qquad \overline{v^2} = \frac{3kT}{m}$$

Show also that the most probable velocity is $(2kT/m)^{1/2}$.


12.8. The Method of Darwin and Fowler.—A statistical method different
from that of Gibbs but also leading to the correct thermodynamic
laws, and more adaptable to the needs of quantum mechanics, has been
introduced by Darwin and Fowler.$^{10}$ We shall first describe its
fundamental features and then use it to derive quantum mechanical
distribution laws. Consider a system made up of $\nu$ similar particles. No
reference to an ensemble will here be made; all arguments concern this single,
real system. If the particles are independent, as will now be assumed, each
individual particle may be said to be in a definite energy state $\epsilon_i$, this $\epsilon$
being an eigenvalue of the Schrödinger equation (see Chapter 11) for the
single particle, with boundary conditions corresponding to the volume of
the total system if the latter is a fluid, or other suitable conditions if it is a
crystal. By the state of the total system we mean the aggregate of single
particle states, that is, the assignment of individual particles to the various
energies $\epsilon$. But a microscopic assignment of particles to energies, as in the
statement: particle 1 has energy $\epsilon_i$, particle 2 has energy $\epsilon_j$, particle 3 has
energy $\epsilon_k$, etc., has no meaning in quantum mechanics in view of the
exclusion principle. The best that can be done in specifying a state is therefore
to say that $a_1$ particles have energy $\epsilon_1$, $a_2$ particles have energy $\epsilon_2$, $\cdots$, $a_i$
particles have energy $\epsilon_i$, etc. Thus a state is defined when a system of
"occupation numbers," $a_1, a_2, \cdots a_s$, is given. These must obviously
satisfy the relation

$$\sum_i a_i = \nu \tag{12-45}$$

We shall also prescribe that the system shall have a fixed total energy $E$,
so that

$$\sum_i a_i \epsilon_i = E \tag{12-46}$$

10 Phil. Mag. 44, 450, 823 (1922); 48, 1, 497 (1923). For a general and more
recent treatment see Fowler, R. H., "Statistical Mechanics," Second Edition, Cambridge
University Press, 1936.


Now it is possible, as will be shown in the next section, to assign a
statistical weight, $w(a_1 \cdots a_s)$, to each state $a_1 \cdots a_s$. The average of a
quantity $Q(a_1 \cdots a_s)$ which takes on different values for different states is
then defined by

$$\bar{Q} = \frac{\sum w(a_1 \cdots a_s)\,Q(a_1 \cdots a_s)}{\sum w(a_1 \cdots a_s)} \tag{12-47}$$

The summations in this expression are understood to be taken over all
values of $a_1, a_2, a_3$, etc., which satisfy conditions (45) and (46); the index $s$
is in general very large; it is given by $\epsilon_s \le E$, $\epsilon_{s+1} > E$. Contact with
experience is made in the Darwin-Fowler theory by assuming that $\bar{Q}$ is the
observed value of the quantity $Q$ when a measurement is made on the
system. The Gibbsian ensemble average is here replaced by an average
over the states of a single thermodynamic system. The fact that they
agree is rather noteworthy from a logical point of view. To carry through
the calculation of an average like (47) it is necessary (a) to construct a
suitable weighting function, $w$; (b) to devise means for evaluating the
restricted sums appearing in that equation.

12.9. Quantum Mechanical Distribution Laws.—In quantum mechanics,
the weight of an energy state is defined as its degree of degeneracy: it is
equal to the number of linearly independent state functions belonging to
the eigenstate in question. This postulate will here be invoked. Our
system, however, is one containing $\nu$ similar particles; hence it is necessary
to apply all the considerations of secs. 11.32 and 11.33, in particular the
exacting demands of the exclusion principle. But for the moment it seems
well to consider the number of eigenstates of $E$ belonging to the statistical
state $(a_1 \cdots a_s)$ when the exclusion principle is left out of account.

As an example, consider the simple case of 5 particles, with energy
partition $a_1 = 2$ and $a_2 = 3$. Let the single-particle function belonging
to the single-particle energy $\epsilon_1$ be $\psi_1$, that belonging to $\epsilon_2$, $\psi_2$. The
Schrödinger equation for the 5 particles corresponding to $E = 2\epsilon_1 + 3\epsilon_2$ will
then be satisfied by the simple product

$$\psi_1(1)\,\psi_1(2)\,\psi_2(3)\,\psi_2(4)\,\psi_2(5) \tag{12-48}$$


as well as by any function obtained from this product through permutation
of the arguments 1 to 5 (each numeral designates all coordinates, including
the spin, of the corresponding particle). But not all the 5! products thus
obtained are independent. For instance a transposition of 1 and 2, or a
permutation among particles 3, 4 and 5 causes no change in the function.
The number of different combinations is obviously equal to the number of
ways in which 5 objects can be arranged in 2 piles, one containing 2, the
other 3 objects. This, according to eq. (1), is

$$\frac{5!}{2!\,3!}$$

The generalization of this result is immediate; the number of different
energy eigenfunctions belonging to $(a_1 a_2 \cdots a_s)$ is given by

$$w(a_1 \cdots a_s) = \frac{\nu!}{a_1!\,a_2! \cdots a_s!} \tag{12-49}$$

This is true so long as each individual particle function, $\psi_i$, is
non-degenerate. Suppose now that the energy $\epsilon_i$ itself can be realized by $g_i$
different functions. Each $\psi_i$ then has a weight $g_i$, and the product of $a_i$
such functions has a weight $g_i^{a_i}$. We then obtain, in place of (49), the more
general result

$$w(a_1 \cdots a_s) = \nu!\,\frac{g_1^{a_1}\,g_2^{a_2} \cdots g_s^{a_s}}{a_1!\,a_2! \cdots a_s!} \tag{12-50}$$


To see this in detail, let us return to the example (48) and assume $g_1 = 3$,
$g_2 = 2$. It is then necessary to introduce new functions, e.g., $b$, $c$, $d$ in place
of $\psi_1$; $e$ and $f$ in place of $\psi_2$. Instead of $\psi_1(1)\psi_1(2)$ we can now have

$b(1)b(2)$, $c(1)c(2)$, $d(1)d(2)$, $b(1)c(2)$, $b(2)c(1)$, $b(1)d(2)$,
$b(2)d(1)$, $c(1)d(2)$, $c(2)d(1)$

and in place of $\psi_2(3)\psi_2(4)\psi_2(5)$

$e(3)e(4)e(5)$, $f(3)f(4)f(5)$, $e(3)f(4)f(5)$, $e(4)f(3)f(5)$, $e(5)f(3)f(4)$,
$e(3)e(4)f(5)$, $e(3)e(5)f(4)$, $e(4)e(5)f(3)$

that is, $3^2 = 9$ and $2^3 = 8$ products respectively.
Eq. (50) would be the statistical weight of the state $a_1 \cdots a_s$ if no symmetry
requirements, no Pauli principle, had to be respected.

How many different functions can be constructed out of the individual
$\psi_i$ when overall antisymmetry is demanded? We have seen that the only
antisymmetric function available has the determinantal form eq. (11-160').
This, however, vanishes when any two particles are described by the same
$\psi$, for then two rows of the determinant are equal. Hence, if there is no
degeneracy in the individual particle functions (all $g_i$ are 1), only one
function is constructible; in other words,

$$w(a_1 \cdots a_s) = \begin{cases} 1 & \text{if every } a_i \le 1 \\ 0 & \text{if any } a_i > 1 \end{cases}$$


On the other hand, if the $i$-th state has a degeneracy $g_i > 1$, the
number of non-vanishing determinants which can be constructed is equal to the
number of ways in which $a_i$ arguments can be distributed among $g_i$
different functions, and this, by virtue of eq. (2), is $\binom{g_i}{a_i}$. Thus we obtain in
general, when the exclusion principle is applied,

$$w(a_1 \cdots a_s) = \binom{g_1}{a_1}\binom{g_2}{a_2} \cdots \binom{g_s}{a_s} \tag{12-51}$$

Note that this vanishes when any $a_i$ is greater than its corresponding $g_i$, so
that the preceding equation is a special case of this.

Finally, we consider the case in which the total function is symmetrical.
As was shown in sec. 11.33, this is of the form $\sum_P \Psi_P$, where $\Psi_P$ is a function
constructed like (48), with a particular permutation of the arguments
1 to $\nu$. But if any permutation of arguments is made in $\sum_P \Psi_P$, this
function is transformed into itself. Hence $w(a_1 \cdots a_s)$ is always 1 provided all
$g$ are 1. But if this is not true, then the degeneracy of $\epsilon_i$ gives rise to as many
different combinations of functions $\psi_1, \psi_2, \ldots, \psi_{g_i}$ as there are ways of
distributing the $a_i$ arguments amongst them, without regard to the number of
arguments associated with the same function. This number, by eq. (4), is
$\binom{g_i + a_i - 1}{a_i}$. We have thus determined $w$ for the symmetrical case
to be

$$w(a_1 \cdots a_s) = \binom{g_1 + a_1 - 1}{a_1}\binom{g_2 + a_2 - 1}{a_2} \cdots \binom{g_s + a_s - 1}{a_s} \tag{12-52}$$

Assemblies of particles whose motion is governed by the Pauli principle
must be described by antisymmetric functions. Their statistical weights
are given by (51). The formulas which ensue from its use are characteristic
of Fermi-Dirac statistics, the type of statistics to which electrons,
neutrons, protons are subject. Henceforth we refer to (51) as $w_{F.D.}$. On
the other hand, photons, nuclei and atoms containing an even number of
elementary particles (e.g., $\mathrm{He}^4$) are known to require symmetrical state
functions for their collective description. Their statistical states have
weights given by (52); they are said to obey Einstein-Bose statistics.
Henceforth we write $w_{E.B.}$ in place of (52). No known constituents of
material bodies are described by (50), although it is precisely that
assignment of weights which leads to the Maxwell-Boltzmann law. It will be
shown, however, that both quantum formulations, (51) and (52), give rise
to distribution laws which under many thermodynamic circumstances are
practically identical with the Maxwell-Boltzmann law. For this reason,
and for the sake of generality, we continue to include eq. (50) in our
consideration, and refer to it as $w_c$$^{11}$ (classical assignment of statistical weights).
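The three weight assignments (50), (51) and (52) are easily evaluated for the example used above, $a = (2, 3)$ with $g = (3, 2)$. The sketch below is illustrative; the function names are chosen here, and the brute-force helper `count_selections` merely enumerates selections of functions to confirm the binomial coefficients.

```python
import math
from itertools import combinations, combinations_with_replacement


def w_classical(a, g):
    """Eq. (12-50): nu!/(a_1! ... a_s!) times the product of g_i**a_i."""
    numerator = math.factorial(sum(a))
    denominator = 1
    for a_i, g_i in zip(a, g):
        numerator *= g_i ** a_i
        denominator *= math.factorial(a_i)
    return numerator // denominator


def w_fermi_dirac(a, g):
    """Eq. (12-51): product of binomial coefficients C(g_i, a_i)."""
    return math.prod(math.comb(g_i, a_i) for a_i, g_i in zip(a, g))


def w_einstein_bose(a, g):
    """Eq. (12-52): product of C(g_i + a_i - 1, a_i)."""
    return math.prod(math.comb(g_i + a_i - 1, a_i) for a_i, g_i in zip(a, g))


def count_selections(a_i, g_i, repeats_allowed):
    """Enumerate ways of picking a_i of the g_i functions, with or without repeats."""
    chooser = combinations_with_replacement if repeats_allowed else combinations
    return sum(1 for _ in chooser(range(g_i), a_i))


a, g = (2, 3), (3, 2)
weights = (w_classical(a, g), w_fermi_dirac(a, g), w_einstein_bose(a, g))
```

For this example the classical weight is $10 \times 9 \times 8 = 720$, the Fermi-Dirac weight vanishes because $a_2 = 3$ exceeds $g_2 = 2$, and the Einstein-Bose weight is $6 \times 4 = 24$.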

It is to be noted that all three statistical weights may be written in the
form

$$w = \prod_j \gamma(a_j)$$

if we put

$$\gamma_c(a_j) = \frac{g_j^{a_j}}{a_j!}, \qquad \gamma_{F.D.}(a_j) = \binom{g_j}{a_j}, \qquad \gamma_{E.B.}(a_j) = \binom{g_j + a_j - 1}{a_j} \tag{12-53}$$

In proceeding thus the factor $\nu!$ in eq. (50) is being omitted. However,
since this factor is independent of the $a$'s and hence constant for all
statistical states, it will cancel when averages are computed after the manner of
eq. (47). This is the principal use which will here be made of the weight
function. In many other problems the omission of $\nu!$ is not permissible.
(See the remarks after eq. 72.)

The quantity $Q$ whose average we wish to calculate is $a_r$, the number of
particles having a given energy $\epsilon_r$. It is necessary, therefore, to evaluate

$$\bar{a}_r = \frac{\sum_a a_r \prod_j \gamma(a_j)}{\sum_a \prod_j \gamma(a_j)} = \frac{A}{W} \tag{12-54}$$


$W$ is here written for the sum of all statistical weights compatible with
the fixed energy $E$, $A$ for $\bar{a}_r W$. The summations appearing in (54) are
taken over all values of $a_1, a_2, a_3 \cdots a_s$ which satisfy both (45) and (46).
We first calculate $W$.

11 The formula for $w_c$ can also be derived as follows, without reference to state
functions. Divide phase space into cells according to the energies of the individual
particles: in the $i$-th cell a particle has energy $\epsilon_i$. If the $i$-th cell has a fundamental
weight $g_i$, then the number of ways in which the state $a_1, a_2 \cdots a_s$ can be realized by
assignments of specific particles to cells is given by eq. (50).

For this purpose consider the expansion

$$M = \sum_{a_1 a_2 \cdots a_s} \prod_j \left[\gamma(a_j)\,x^{a_j} z^{a_j \epsilon_j}\right] \tag{12-55}$$

in which the summation over the $a$'s is entirely unrestricted, each one of
the many $a$'s taking all values from 0 to $\infty$. $M$ may be regarded as a
function of $x$ and $z$, depending parametrically upon the eigenvalues $\epsilon_j$
characteristic of the particles in question. A moment's reflection will show that
$W$ is the coefficient of $x^\nu z^E$ in the expansion $M$; in other words, because of
the theorem of residues eq. (3-3),

$$W = \left(\frac{1}{2\pi i}\right)^2 \oint\!\oint \frac{M\,dx\,dz}{x^{\nu+1} z^{E+1}} \tag{12-56}$$

the integrals being taken counter-clockwise about the poles of the
integrand, i.e., about $x = 0$ and $z = 0$, $x$ and $z$ being considered as complex
variables.
Now $M$ may be evaluated rather simply. First note that it can be
written

$$M = \sum_{a_1 \cdots a_s} \prod_j \left[\gamma(a_j)\,(x z^{\epsilon_j})^{a_j}\right] = \prod_j \left\{\sum_{n=0}^{\infty} \gamma(n)\,(x z^{\epsilon_j})^n\right\}$$

The summation in $\{\ \}$ can be performed for all three of the functions $\gamma$
listed in (53). Let us put $x z^{\epsilon_j} = r$; $\sum_{n=0}^{\infty} \gamma(n)\,r^n = f(r)$. We then obtain

$$f_c = \sum_n \frac{(g_j r)^n}{n!} = \exp(g_j r) = \exp\{g_j x z^{\epsilon_j}\}$$

$$f_{F.D.} = \sum_n \binom{g_j}{n} r^n = (1 + r)^{g_j} = (1 + x z^{\epsilon_j})^{g_j} \tag{12-57}$$

$$f_{E.B.} = \sum_n \binom{g_j + n - 1}{n} r^n = (1 - r)^{-g_j} = (1 - x z^{\epsilon_j})^{-g_j}$$

The last result, which is perhaps not so obvious, is easily verified by writing
down the MacLaurin expansion of

$$(1 - r)^{-g_j} = 1 + g_j r + \frac{g_j(g_j + 1)}{2!}\,r^2 + \cdots$$

which is identical with the summation in $f_{E.B.}$.

Thus we see that

$$M = \prod_j f(x z^{\epsilon_j}) \tag{12-58}$$

and $W$ is related to $M$ by eq. (56). The effect of the degeneracy factors $g_j$
in (57) is rather interesting: they merely appear as exponents in the
$f$-functions in all three cases.

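Let us now consider the numerator of eq. (54). If we calculate the
quantity $(1/\ln z)(\partial M/\partial \epsilon_r)$, we find, using eq. (55),

$$\frac{1}{\ln z}\,\frac{\partial}{\partial \epsilon_r} \sum_{a_1 \cdots a_s} \prod_j \gamma(a_j)\,x^{a_j} z^{a_j \epsilon_j} = \sum_{a_1 \cdots a_s} a_r \prod_j \gamma(a_j)\,x^{a_j} z^{a_j \epsilon_j}$$

On comparing this with (54) we see that $A$ is the coefficient of $x^\nu z^E$ in the
expansion of $(1/\ln z)(\partial M/\partial \epsilon_r)$. But this last quantity can be put in a
form more suitable for our purposes. In view of (58),

$$\frac{1}{\ln z}\,\frac{\partial M}{\partial \epsilon_r} = \frac{1}{\ln z}\,\frac{M}{f(x z^{\epsilon_r})}\,\frac{\partial f(x z^{\epsilon_r})}{\partial \epsilon_r} = M x\,\frac{\partial}{\partial x} \ln f(x z^{\epsilon_r})$$

Summarizing these steps, we note:

$$A = \left(\frac{1}{2\pi i}\right)^2 \oint\!\oint \left[x\,\frac{\partial}{\partial x} \ln f(x z^{\epsilon_r})\right] \frac{M\,dx\,dz}{x^{\nu+1} z^{E+1}} \tag{12-59}$$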

If now it were possible to find a path of integration around the origin
with respect to both $x$ and $z$, such that the function $M x^{-\nu-1} z^{-E-1}$ were
practically zero everywhere along that path except in the immediate
neighborhood of two definite points, say $x = \xi$ and $z = \vartheta$, the evaluation
of $\bar{a}_r = A/W$ would be very simple. For it would then be permissible to
take the factor $x(\partial/\partial x) \ln f$, multiplying the integrand in (59), in front
of the integral sign and give to $x$ and $z$ the values $\xi$ and $\vartheta$, and the integrals
themselves would cancel. We should then have

$$\bar{a}_r = \xi\,\frac{\partial}{\partial \xi} \ln f(\xi \vartheta^{\epsilon_r}) \tag{12-60'}$$

This procedure is indeed justified, as the following section will show. If
the $f$-functions are identified in accordance with (57), the result is seen to be

$$(\bar{a}_r)_c = g_r\,\xi \vartheta^{\epsilon_r}$$

$$(\bar{a}_r)_{F.D.} = \frac{g_r}{\xi^{-1} \vartheta^{-\epsilon_r} + 1}$$

$$(\bar{a}_r)_{E.B.} = \frac{g_r}{\xi^{-1} \vartheta^{-\epsilon_r} - 1}$$

The physical significance of the parameters $\xi$ and $\vartheta$ can be fixed most simply
by the subsequent plausibility argument which we offer in lieu of more
detailed considerations$^{12}$ adducible to establish this meaning more
completely. The first of the expressions above must be the Maxwell-Boltzmann
law which reads, for the situation here considered, $\bar{a}_r = A_0\,g_r\,e^{-\epsilon_r/kT}$,
the factor $A_0$ being so determined that $\sum \bar{a}_r = \nu$. Hence $\xi$ must be
identified with $A_0$, and $\vartheta$ with $e^{-1/kT}$. In the other two relations $\xi$ must also act
as a normalizing factor, while $\vartheta$ has the same meaning as in $(\bar{a}_r)_c$. Hence
we conclude

$$(\bar{a}_r)_c = g_r\,\xi\,e^{-\epsilon_r/kT} \qquad \text{(a)}$$

$$(\bar{a}_r)_{F.D.} = \frac{g_r}{\xi^{-1} e^{\epsilon_r/kT} + 1} \qquad \text{(b)} \tag{12-60}$$

$$(\bar{a}_r)_{E.B.} = \frac{g_r}{\xi^{-1} e^{\epsilon_r/kT} - 1} \qquad \text{(c)}$$

It is easily seen in a qualitative way that $\xi$ must increase when $\nu$
increases (the volume of the system being fixed) if $\sum \bar{a}_r$ is to remain equal
to $\nu$. In (a), $\xi$ is in fact proportional to $\nu$, but this simple dependence
fails in (b) and (c). Nevertheless, if $\nu$ is very small, $\xi^{-1} \gg 1 > e^{-\epsilon_r/kT}$,
so that $\xi^{-1} e^{\epsilon_r/kT} \gg 1$.
In this case both (b) and (c) reduce to the classical form (a). Hence for
sufficiently small densities all assemblies show an essentially classical
behavior. Closer investigation (see any of the references at the beginning
of this chapter) indicates that this is true for all ordinary molecules at
ordinary temperatures, thus justifying the use of classical statistics.
The main instances in which quantum distribution laws are needed are the
motion of electrons in metals (b), the photon gas, and helium at very low
temperatures (c).
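The approach of (b) and (c) to the classical form (a) for small $\xi$ can be seen numerically. The function below is a minimal sketch; its name, the string labels for the three kinds of statistics and the sample values are assumptions made for this illustration.

```python
import math


def mean_occupation(eps, g, xi, kT, statistics):
    """Eqs. (12-60 a, b, c); statistics is 'c', 'FD' or 'EB' (labels chosen here)."""
    x = math.exp(eps / kT) / xi  # the quantity xi^-1 * exp(eps/kT)
    if statistics == "c":
        return g / x
    if statistics == "FD":
        return g / (x + 1.0)
    if statistics == "EB":
        return g / (x - 1.0)  # meaningful only while x > 1
    raise ValueError("unknown statistics label")


eps, g, kT = 1.0, 2, 1.0
dilute = {s: mean_occupation(eps, g, 1e-6, kT, s) for s in ("c", "FD", "EB")}
dense = {s: mean_occupation(eps, g, 1.0, kT, s) for s in ("c", "FD", "EB")}
```

With $\xi = 10^{-6}$ the three values differ by less than one part in a million, whereas with $\xi = 1$ they are clearly distinct, the Fermi-Dirac value lying below and the Einstein-Bose value above the classical one.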

All thermodynamic relations can be deduced by the method here
described provided the following associations between thermodynamic
quantities and elements appearing in the Darwin-Fowler scheme are
made; the first is obvious, the second has already been obtained; the
others will be derived later on (cf. eqs. 71 and 72):

$U$ corresponds to $E$

$T$ corresponds to $-(k \ln \vartheta)^{-1}$ (12-61)

$S$ corresponds to $k \ln W$

$A$ corresponds to $-kT\left(\sum_j \ln f(\xi \vartheta^{\epsilon_j}) - \nu \ln \xi\right)$
j 


12 Cf. Fowler, loc. cit.

12.10. The Method of Steepest Descents.—The evaluation of $\bar{a}_r$ in the
last section depended for its validity on our ability to find a point $x = \xi$ in
the integration with respect to $x$, and another point $z = \vartheta$ in the integration
with respect to $z$, at which the integrand of (56) would be large, and
where its value would descend very steeply on both sides along the path
of integration. Such a point has interesting properties which we shall
first investigate in connection with a simpler but more general example.

Let $\varphi(z)$ be a function in the complex plane, $z$ being $x + iy$. We wish
to find a point in the $X$, $Y$-plane such that, as we cross that point in the
direction of steepest descent of $\varphi$, $\varphi$ will be a maximum at the point. To
be more specific we will refer in this inquiry not to $\varphi$, but to the real part
of $\varphi$. Suppose the point thus defined is $z = \vartheta$.

On writing

$$\varphi(z) = g(x,y) + ih(x,y) \tag{12-62}$$

where $g$ and $h$ are real functions, it is clear that these must satisfy the
Riemann relations:$^{13}$

$$g_x = h_y, \qquad g_y = -h_x \tag{12-63}$$


Our specification amounts to this: $dg = g_x\,dx + g_y\,dy$ shall be zero along
the path on which $g$ decreases most rapidly. The direction of this path is
the direction of the negative gradient of $g$, namely $-\nabla g$. Since this is
$-(\mathbf{i}\,g_x + \mathbf{j}\,g_y)$, this direction is defined by $dy/dx = g_y/g_x$. But by reason
of eq. (63), this is $-h_x/h_y$. On the other hand, if $dy/dx = -h_x/h_y$, then

$$h_x\,dx + h_y\,dy = dh = 0$$

in the same direction. Now the vanishing of both $dg$ and $dh$ at $\vartheta$ is possible
only if $\varphi'(\vartheta) = 0$. We may conclude, therefore, that the point in question,
if it exists at all, satisfies the condition

$$\varphi'(\vartheta) = 0$$

If $\varphi'$ has a real root, the point $\vartheta$ will obviously lie on the real axis.
13 By $g_x$ is meant $\partial g/\partial x$, etc. To prove these well known relations we observe
$\varphi_x = \varphi' = g_x + ih_x$ and $\varphi_y = i\varphi' = g_y + ih_y$,
with $\varphi' = d\varphi/dz$. Hence $\varphi' = g_x + ih_x = -ig_y + h_y$, which is equivalent to eqs. (63)
when real and imaginary parts are equated separately. Note also that (63) implies
$g_{xx} = -g_{yy}$.

Next, it will be shown that $\vartheta$ is a "saddle point," i.e., that the curvature
on the path of steepest descent is opposite to that along a path at right
angles to this direction. For any direction $dy/dx$,

$$d^2g = g_{xx}\,dx^2 + 2g_{xy}\,dx\,dy + g_{yy}\,dy^2$$

The direction at right angles to $dy/dx$ is fixed by the substitution of $-dy$
for $dx$, and of $dx$ for $dy$. Hence, on writing $d^2g_\perp$ for the curvature at right
angles to the direction $dy/dx$,

$$d^2g_\perp = g_{xx}\,dy^2 - 2g_{xy}\,dx\,dy + g_{yy}\,dx^2$$

so that

$$d^2g + d^2g_\perp = (g_{xx} + g_{yy})(dx^2 + dy^2) = 0$$

in view of eqs. (63).
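The saddle property can be observed with finite differences. The sketch below uses the example $\varphi(z) = e^z - z$, chosen here only because $\varphi'(z) = e^z - 1$ vanishes at $z = 0$; the function names and the step size are likewise illustrative.

```python
import cmath


def phi(z):
    """Example analytic function with phi'(0) = 0 (an assumption of this sketch)."""
    return cmath.exp(z) - z


def g(x, y):
    """Real part of phi at the point x + iy."""
    return phi(complex(x, y)).real


h = 1e-3
x0, y0 = 0.0, 0.0  # location of the saddle point of this example

# Central second differences along the real and imaginary directions.
g_xx = (g(x0 + h, y0) - 2.0 * g(x0, y0) + g(x0 - h, y0)) / h ** 2
g_yy = (g(x0, y0 + h) - 2.0 * g(x0, y0) + g(x0, y0 - h)) / h ** 2
```

Here `g_xx` is close to $+1$ and `g_yy` close to $-1$: the real part has a minimum along the real axis and a maximum along the imaginary direction, and the two curvatures cancel as required by $g_{xx} + g_{yy} = 0$.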
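With this general knowledge, let us return to the calculation of (56):

$$W = (2\pi i)^{-2} \oint\!\oint M x^{-\nu-1} z^{-E-1}\,dx\,dz \tag{12-64}$$

$$M = \prod_j f(r_j), \qquad r_j = x z^{\epsilon_j}$$

Let us further put

$$M x^{-\nu} z^{-E} = e^{Y(x,z)}$$

so that

$$Y = \sum_j \ln f(r_j) - \nu \ln x - E \ln z \tag{12-65}$$

A saddle point of the integrand of (64) is then determined by

$$\frac{\partial Y}{\partial x} = 0$$

in the integration with respect to $x$, and by

$$\frac{\partial Y}{\partial z} = 0$$

in the integration with respect to $z$. The first of these leads to

$$\sum_j \frac{f'(\rho_j)}{f(\rho_j)}\,\vartheta^{\epsilon_j} - \frac{\nu}{\xi} = 0 \tag{12-66}$$

the second to

$$\sum_j \frac{f'(\rho_j)}{f(\rho_j)}\,\epsilon_j\,\xi \vartheta^{\epsilon_j - 1} - \frac{E}{\vartheta} = 0 \tag{12-67}$$

where $\rho_j$ has been written for $\xi \vartheta^{\epsilon_j}$. Eqs. (66) and (67) define the saddle
point $(\xi, \vartheta)$ in $X$-$Z$ space:

$$\sum_j \rho_j\,\frac{f'(\rho_j)}{f(\rho_j)} = \nu \tag{12-66'}$$

$$\sum_j \epsilon_j\,\rho_j\,\frac{f'(\rho_j)}{f(\rho_j)} = E \tag{12-67'}$$

These results at once take on a more interesting form when we insert
eq. (60'):

$$\bar{a}_r = \xi\,\frac{\partial}{\partial \xi} \ln f(\rho_r) = \rho_r\,\frac{f'(\rho_r)}{f(\rho_r)}$$

The saddle point conditions are then seen to be nothing more than the
conservation conditions of our problem:

$$\sum_j \bar{a}_j = \nu, \qquad \sum_j \bar{a}_j \epsilon_j = E$$

In classical statistics we may also write

$$\sum_j g_j\,\xi \vartheta^{\epsilon_j} = \nu, \qquad \sum_j g_j\,\xi \vartheta^{\epsilon_j}\,\epsilon_j = E$$

In quantum statistics, the equations become

$$\sum_j \frac{g_j}{\xi^{-1} \vartheta^{-\epsilon_j} \pm 1} = \nu, \qquad \sum_j \frac{g_j\,\epsilon_j}{\xi^{-1} \vartheta^{-\epsilon_j} \pm 1} = E$$

where the positive sign is to be taken in the Fermi-Dirac case and the
negative sign in the Bose-Einstein case.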
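In practice the saddle point conditions are often used the other way round: the temperature, and hence $\vartheta = e^{-1/kT}$, is given, the first condition is solved for $\xi$, and the second then yields $E$. The sketch below does this for the Fermi-Dirac case by bisection; the level scheme, the bracketing interval and the function names are assumptions of this example and not part of the text.

```python
import math


def total_particles(xi, eps, g, kT):
    """Left-hand side of the first conservation condition, Fermi-Dirac case."""
    return sum(g_j / (math.exp(e_j / kT) / xi + 1.0) for e_j, g_j in zip(eps, g))


def solve_xi(eps, g, kT, nu, lo=1e-12, hi=1e12, iterations=200):
    """Bisection on log(xi); the particle sum increases monotonically with xi."""
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        if total_particles(mid, eps, g, kT) < nu:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Illustrative level scheme; nu must stay below the total number of states, sum(g).
eps = [0.0, 1.0, 2.0, 3.0, 4.0]
g = [2, 2, 2, 2, 2]
kT, nu = 1.0, 4
xi = solve_xi(eps, g, kT, nu)
occupations = [g_j / (math.exp(e_j / kT) / xi + 1.0) for e_j, g_j in zip(eps, g)]
energy = sum(e_j * a_j for e_j, a_j in zip(eps, occupations))
```

The resulting occupation numbers add up to $\nu$, decrease with increasing $\epsilon_j$, and never exceed the degeneracy $g_j$, as the exclusion principle demands.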

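In the following we shall also need the values of $\partial^2 Y/\partial x^2$, $\partial^2 Y/\partial z^2$, and
$\partial^2 Y/\partial x\,\partial z$ at the saddle point. To save writing, the discussion will be
limited for the present to F.D. statistics. Here one finds$^{14}$

$$\frac{\partial^2 Y}{\partial x^2} = \frac{1}{\xi^2} \sum_j \frac{g_j\,\xi \vartheta^{\epsilon_j}}{(1 + \xi \vartheta^{\epsilon_j})^2} = \frac{A}{\xi^2}$$

$$\frac{\partial^2 Y}{\partial z^2} = \frac{1}{\vartheta^2} \sum_j \frac{g_j\,\epsilon_j^2\,\xi \vartheta^{\epsilon_j}}{(1 + \xi \vartheta^{\epsilon_j})^2} = \frac{B}{\vartheta^2}$$

$$\frac{\partial^2 Y}{\partial x\,\partial z} = \frac{1}{\xi \vartheta} \sum_j \frac{g_j\,\epsilon_j\,\xi \vartheta^{\epsilon_j}}{(1 + \xi \vartheta^{\epsilon_j})^2} = \frac{C}{\xi \vartheta}$$

The quantities $A$, $B$, and $C$ are to be defined by these relations: it is
easily seen that, for a gas consisting of many particles, they are very large
numbers.$^{15}$ Moreover, the reader will be able to show that

$$AB > C^2 \tag{12-68}$$

always.

14 Note that the symbol $A$ has a different meaning than in the last section! Neither
is it to be confused with the Helmholtz free energy.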
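Having located the saddle point, we expand $Y(x,z)$ in its neighborhood.

$$Y(x,z) = Y(\xi,\vartheta) + (x - \xi)\,\frac{\partial Y}{\partial \xi} + (z - \vartheta)\,\frac{\partial Y}{\partial \vartheta} + \frac{1}{2}\left[(x - \xi)^2\,\frac{\partial^2 Y}{\partial \xi^2} + (z - \vartheta)^2\,\frac{\partial^2 Y}{\partial \vartheta^2} + 2(x - \xi)(z - \vartheta)\,\frac{\partial^2 Y}{\partial \xi\,\partial \vartheta}\right] \tag{12-69}$$

The first derivatives vanish at the saddle point, which, as may be shown
from eqs. (66) and (67), lies on the real axis of both $x$ and $z$. In order to
carry out the integrations in (64), it is suggested that the paths be taken
across $\xi$ and $\vartheta$, and this is done with greatest convenience by choosing the
circuits

$$x = \xi e^{i\alpha}, \quad -\pi \le \alpha \le \pi; \qquad z = \vartheta e^{i\beta}, \quad -\pi \le \beta \le \pi$$

When these substitutions are made in (69), this expression becomes

$$Y(x,z) = Y(\xi,\vartheta) - \tfrac{1}{2}(A\alpha^2 + B\beta^2 + 2C\alpha\beta)$$

in the neighborhood of $\xi$, $\vartheta$, since for small $\alpha$ and $\beta$

$$x - \xi = i\xi\alpha \quad \text{and} \quad z - \vartheta = i\vartheta\beta$$

Therefore, in view of (65),

$$M x^{-\nu} z^{-E} = e^{Y(\xi,\vartheta)}\,e^{-(1/2)(A\alpha^2 + B\beta^2 + 2C\alpha\beta)}$$

and

$$W = (2\pi i)^{-2} \iint e^{Y(\xi,\vartheta)}\,e^{-(1/2)(A\alpha^2 + B\beta^2 + 2C\alpha\beta)}\,(i\,d\alpha)(i\,d\beta) \tag{12-70}$$

This result shows with impressive clarity how rapidly the integrand
"descends" from its saddle point: its "half width" with respect to $\alpha$, for
example, is approximately given by $A^{-1/2}$. But $A$ is of the same order of
magnitude as $\nu$, the total number of particles. The procedure of the
foregoing section was therefore proper.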

The question arises as to the behavior of the integrand at points on the contour not in the neighborhood of the saddle point, for we hardly have reason thus far to expect that it is small everywhere else. This, however, is not difficult to prove. When written in terms of $\alpha$ and $\beta$, the function f of (57) takes the form

$$f_{F.D.} = \left(1 + \xi\vartheta^{\epsilon_j} e^{iu_j}\right)^{g_j}, \qquad u_j = \alpha + \beta\epsilon_j$$


15 The $\epsilon$ must here be regarded as dimensionless and of order of magnitude unity or greater. This can be achieved by measuring kT and $\epsilon$ in the same conveniently chosen unit, in which case kT and hence $\vartheta$, also, become pure numbers.

and M becomes
. — ; _, 4 sin u; 
M = IEC + 29% cos u + P84)? oxp igs tan io 


The product of all exponential terms, which are purely imaginary, can never exceed unity, the value which it has at the saddle point. Each term in the first parenthesis attains its maximum when $u_j = 0$, or in general $2n\pi$. If $u_j = 2n\pi$ for a given choice of $\alpha$ and $\beta$ there will be very many $u_k$, $k \neq j$, for which the parenthesis will not assume its maximum value, so that the product, having a great number of factors, will be much smaller than its value at $\xi$, $\vartheta$. The only way to insure that M will be a maximum is to make $u_j = 0$ for all j, and this requires that both $\alpha$ and $\beta$ be zero. This maximum will be very strong provided the number of energy states, $\epsilon_j$, is large.

It is of some interest, finally, to conclude the explicit calculation of W. The integral in (70) is easily performed, for the limits may clearly be replaced by $+\infty$ and $-\infty$. Remembering the formula

$$\int_{-\infty}^{\infty} e^{-ax^2 \pm 2bx}\,dx = \sqrt{\frac{\pi}{a}}\; e^{b^2/a}$$

we find

$$W = \frac{e^{Y(\xi,\vartheta)}}{2\pi\sqrt{AB - C^2}}$$

which, in view of (68), is real and positive. The entropy, defined in (61), becomes therefore

$$S = kY(\xi,\vartheta) - k \ln\left[2\pi\sqrt{AB - C^2}\right] \qquad (12\text{-}71)$$


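In chemistry, it is customary to neglect the second term of S because it is much smaller than the first when the number of particles is large.¹⁶ Now

$$Y(\xi,\vartheta) = \ln M(\xi,\vartheta) = \sum_j \ln f_j - \nu \ln \xi - E \ln \vartheta = \sum_j \ln f_j - \nu \ln \xi + \frac{E}{kT}$$

In view of (71), then,

$$E - ST = -kT\left(\sum_j \ln f_j - \nu \ln \xi\right) \qquad (12\text{-}72)$$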


This justifies the identification of the free energy made in (61). 

Finally let us endeavor to make contact with classical statistics again. Here it must be remembered that a factor $\nu!$ was omitted in the evaluation of W. We must therefore add to (71) the quantity $k \ln \nu!$, so that we have in place of (72) for the classical free energy $A_{class}$

$$E - ST = A_{class} = -kT\left(\sum_j \ln f_j - \nu \ln \xi + \ln \nu!\right)$$

But

$$\sum_j \ln f_j = \nu$$

by eq. (66’) and

$$\ln \nu! = \nu \ln \nu - \nu$$

by Stirling’s theorem. Hence

$$A_{class} = -kT\left(\nu \ln \frac{\nu}{\xi}\right)$$

Again by (66), $\nu/\xi = \sum_j g_j \vartheta^{\epsilon_j}$, a quantity to be denoted by Z and often called the partition function. In terms of Z,

$$A_{class} = -\nu kT \ln Z$$

16 The full expression must be used when attention is given to the entropy of a nucleus, which contains relatively few neutrons and protons.


Comparison with the last equation of sec. 12.7 shows that Z is the quantum mechanical analogue of Gibbs’ phase integral.¹⁷

The computational aspects of the statistical method can be simply summarized as follows. Given a system of $\nu$ particles whose total energy is E. Each particle has energies $\epsilon_j$ obtainable by solving the Schrödinger equation with boundary conditions corresponding to the volume in which the particles are enclosed. For instance, if the volume is a parallelepiped, the $\epsilon_j$ are given by eq. (11-36). Thus they depend on the volume of the container.

The thermodynamic properties of the system then depend on two parameters, $\xi$ and $\vartheta$, defined by eqs. (66’) and (67’). When these are solved simultaneously and $\xi$ and $\vartheta$ are known for the given $\nu$ and E, the quantities T, S and A can be calculated from (61). In F.D. and E.B. statistics, eqs. (66’) and (67’) are such that there is no general method for obtaining explicit solutions for $\xi$ and $\vartheta$. Recourse must then be had to approximations, valid in different ranges of the parameter $\xi$.¹⁸


Problem. Find the values of A, B, C in classical and in E.B. statistics. Note that the classical values are obtainable from those derived in the preceding section by letting $\xi \to 0$.


17 Partition functions for specific substances can often be computed from spectroscopic data. Such calculations are becoming increasingly important in applied thermodynamics. See Taylor, H. S., and Glasstone, S., “A Treatise on Physical Chemistry,” Third Edition, Vol. 1, D. Van Nostrand Co., Inc., New York, 1942.

18 See for instance Fowler, loc. cit.

CHAPTER 13
NUMERICAL CALCULATIONS


13.1. Introduction—We describe here certain types of numerical 
calculations which are often required. No theory¹ is presented, but the
methods are explained and illustrated by means of worked examples. 
The reader will find that such computations are usually tedious and time- 
consuming, hence he should exercise his ingenuity in devising means of 
reducing the labor involved. Before starting a calculation, he should 
always consider the possibility of using graphical methods, for these are 
often simpler than the numerical ones. He should also remember that 
there is some advantage in representing numerical data by equations of 
empirical or theoretical form. Such equations, obtained by the method 
of least squares or otherwise (see sec. 13.37) are generally easier to use for 
interpolation, differentiation or integration than the methods of this 
chapter. Finally, he should note that when alternative procedures are 
given for a particular operation, the special problem at hand may often 
suggest which of these is the most suitable. 

It is assumed that the reader is familiar with the elementary facts 
concerning significant figures, rounding off and number of significant 
figures to be retained in addition, multiplication, etc.²

For convenience, we divide this chapter into three separate parts. 
The first deals with methods primarily based on interpolation formulas; 
the second, with miscellaneous algebraic calculations; and the third, with a
discussion of errors and related problems.


1 See references at end of chapter.

2 Retention of an unnecessary number of significant figures should be carefully avoided, especially in physical and chemical calculations. If (n + 1) digits are carried along in the intermediate stages of the calculations, the final result, obtained by rounding off and thus containing n significant figures, will be uncertain by one or two units in its last digit. This practice is customary in treating scientific data.


PART 1. NUMERICAL METHODS BASED ON
INTERPOLATION FORMULAS


INTERPOLATION


13.2. Interpolation for Equal Values of the Argument.—It often happens that data are given in tabular form with values of x and y = f(x) at certain intervals of x. Suppose a value of y is needed for an x which is not listed in the table. Usually, the simplest procedure is to plot y against x, draw a smooth curve through the points and read from the graph the required value of y. The same result may be obtained by the use of interpolation formulas. Provided the given values of x are equidistant, we first form a difference table³ as shown in Table 1 where the first, second, third and r-th differences are given by

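$$\Delta y_0 = y_1 - y_0,\quad \Delta y_1 = y_2 - y_1,\ \ldots,\quad \Delta y_{n-1} = y_n - y_{n-1},\quad \Delta y_n = y_{n+1} - y_n$$

$$\Delta^2 y_0 = \Delta y_1 - \Delta y_0 = y_2 - 2y_1 + y_0,\ \ldots$$

$$\Delta^2 y_n = \Delta y_{n+1} - \Delta y_n = y_{n+2} - 2y_{n+1} + y_n$$

$$\Delta^3 y_n = \Delta^2 y_{n+1} - \Delta^2 y_n = y_{n+3} - 3y_{n+2} + 3y_{n+1} - y_n$$

$$\Delta^r y_n = \Delta^{r-1} y_{n+1} - \Delta^{r-1} y_n = y_{n+r} - r\,y_{n+r-1} + \frac{r(r-1)}{2!}\, y_{n+r-2} + \cdots + (-1)^r y_n = \sum_{m=0}^{r} (-1)^m \binom{r}{m} y_{n+r-m} \qquad (13\text{-}1)$$

TABLE 1

x     y     Δ      Δ²      Δ³      Δ⁴      Δ⁵      Δ⁶
x0    y0
x1    y1    Δy0
x2    y2    Δy1    Δ²y0
x3    y3    Δy2    Δ²y1    Δ³y0
x4    y4    Δy3    Δ²y2    Δ³y1    Δ⁴y0
x5    y5    Δy4    Δ²y3    Δ³y2    Δ⁴y1    Δ⁵y0
x6    y6    Δy5    Δ²y4    Δ³y3    Δ⁴y2    Δ⁵y1    Δ⁶y0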


In forming such a table of differences, care must be taken to maintain the 
correct signs; the subtractions must all be performed in the order given in 
(1). A convenient check may be obtained by noting that the sum of the 
entries in any column equals the difference between the first and last 
entries in the preceding column. It also happens in most cases that the 
differences of some order will be zero or will vary (perhaps with alternating 
signs) only in the last few figures of the numbers retained. This is the 
basis for all of the methods described in the first part of this chapter, for if 
the unknown f(x) were a polynomial of the n-th degree, the n-th differences 
would be constant and the (n + 1)-th differences zero.
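
The construction of a difference table and the column-sum check described above can be sketched in Python as follows. The function name `difference_table` and the sample data (y = exp(-x²) rounded to five places at x = 0, 0.05, ..., 0.30) are assumptions made here for illustration only.

```python
import math

def difference_table(y):
    """Return [y, Δy, Δ²y, ...]; column r holds Δ^r y_0, Δ^r y_1, ..."""
    columns = [list(y)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        # Subtract in the order of (13-1): later entry minus earlier entry.
        columns.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return columns

# Assumed sample data: y = exp(-x^2), rounded to five decimal places.
xs = [0.05 * i for i in range(7)]
ys = [round(math.exp(-x * x), 5) for x in xs]
table = difference_table(ys)

# Check from the text: the sum of the entries in any column equals the
# difference between the last and first entries of the preceding column.
for r in range(1, len(table)):
    assert abs(sum(table[r]) - (table[r - 1][-1] - table[r - 1][0])) < 1e-12
```

Each column is stored as a flat list, so the entry written on the row of x_n in the printed layout is simply the last element of its column.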


3 Many different notations and forms of the difference table will be found in books on numerical methods but it will usually be simple to find the relations between the various symbols used.

Now if $x_k$ and $y_k$ are values given in such a table, h is the common interval of x, $h = x_1 - x_0 = x_2 - x_1 = \cdots = x_n - x_{n-1}$, and

$$x = x_k + hu; \qquad u = \frac{x - x_k}{h} \qquad (13\text{-}2)$$

then a value of y for an x not contained in the table is given by Newton’s interpolation formula,

$$y = y_k + u\,\Delta y_k + \frac{u(u-1)}{2!}\,\Delta^2 y_k + \frac{u(u-1)(u-2)}{3!}\,\Delta^3 y_k + \cdots + \frac{u(u-1)(u-2)\cdots(u-r+1)}{r!}\,\Delta^r y_k \qquad (13\text{-}3)$$

A second useful form of this equation may also be obtained:

$$y = y_k + u\,\Delta y_{k-1} + \frac{u(u+1)}{2!}\,\Delta^2 y_{k-2} + \frac{u(u+1)(u+2)}{3!}\,\Delta^3 y_{k-3} + \cdots + \frac{u(u+1)(u+2)\cdots(u+r-1)}{r!}\,\Delta^r y_{k-r} \qquad (13\text{-}4)$$


It will be noticed that (3) involves differences lying on a diagonal line in the
table, starting from $y_k$, while (4) uses differences on a horizontal line from
$y_k$. Thus (3) should be used for interpolation near the beginning of a
difference table and (4) for interpolation near the end. Summation should 
be continued until the desired number of significant figures is obtained. 
These two formulas may also be used to extrapolate at both ends of the 
difference table but due caution should be used in such cases unless it is 
known that the function is continuous beyond the tabulated values. 

Example 1. Interpolate in Table 2 to find $y = e^{-x^2}$ for x = 0.0477. We take $x_k$ = 0.05, thus h = 0.05, u = −0.046. Using (3),

$$y = 0.99750 + 4.6 \times 7.45 \times 10^{-5} - \frac{4.6 \times 1.046 \times 4.85}{2} \times 10^{-5} - \frac{4.6 \times 1.046 \times 2.046 \times 1.9}{6} \times 10^{-6}$$

$$= 0.99750 + 0.00034 - 0.00012 = 0.99772$$

It will be noticed that the third and fourth differences are too small for consideration. The result is correct to the last figure given as may be found by expanding $e^{-x^2}$ in a power series. In this case, the calculations may easily be performed with a slide-rule.

TABLE 2⁴

x       y = e^(−x²)     Δ        Δ²      Δ³     Δ⁴
0       1.00000
0.05    0.99750       −250
0.10    0.99005       −745     −495
0.15    0.97775       −1230    −485    +10
0.20    0.96079       −1696    −466    +19    +9
0.25    0.93941       −2138    −442    +24    +5
0.30    0.91393       −2548    −410    +32    +8


Example 2. Calculate $y = e^{-x^2}$ for x = 0.2862. Since this value is near the end of the table, it is better to use (4) with $x_k$ = 0.30, u = −0.276. Then,

$$y = 0.91393 + 2.548 \times 2.76 \times 10^{-3} + \frac{4.1 \times 2.76 \times 7.24}{2} \times 10^{-5} - \frac{3.2 \times 2.76 \times 7.24 \times 1.724}{6} \times 10^{-6}$$

$$= 0.91393 + 0.00703 + 0.00041 - 0.00002 = 0.92135$$

This result is also correct to the last significant figure.
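
The two Newton formulas can be sketched in Python as follows, using the tabulated values of Table 2 and the arguments of Examples 1 and 2. The function names `newton_forward` and `newton_backward` and the `order` parameter (the highest difference retained) are assumptions introduced here, not notation from the text.

```python
def differences(y):
    """Columns of successive differences; cols[r][i] is Δ^r y_i."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        p = cols[-1]
        cols.append([p[i + 1] - p[i] for i in range(len(p) - 1)])
    return cols

def newton_forward(xs, ys, x, k, order):
    """Formula (13-3): uses Δ^r y_k, the diagonal starting at x_k."""
    h = xs[1] - xs[0]
    u = (x - xs[k]) / h
    d = differences(ys)
    total, coeff = ys[k], 1.0
    for r in range(1, order + 1):
        coeff *= (u - (r - 1)) / r      # u(u-1)...(u-r+1)/r!
        total += coeff * d[r][k]
    return total

def newton_backward(xs, ys, x, k, order):
    """Formula (13-4): uses Δ^r y_{k-r}, the horizontal line ending at x_k."""
    h = xs[1] - xs[0]
    u = (x - xs[k]) / h
    d = differences(ys)
    total, coeff = ys[k], 1.0
    for r in range(1, order + 1):
        coeff *= (u + (r - 1)) / r      # u(u+1)...(u+r-1)/r!
        total += coeff * d[r][k - r]
    return total

xs = [0.0, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]
y1 = newton_forward(xs, ys, 0.0477, k=1, order=3)    # Example 1
y2 = newton_backward(xs, ys, 0.2862, k=6, order=3)   # Example 2
```

Rounded to five places, `y1` and `y2` agree with the hand calculations above.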

An arrangement of tabulated data, somewhat different from that of Table 1, leads to central difference formulas, notably those of Stirling and Bessel. While these converge faster than Newton’s formulas, this advantage in most cases is of no practical importance.⁵


Problem. Interpolate or extrapolate from the data of Table 2 to find $y = e^{-x^2}$ for x = 0.045; 0.2775; 0.3018.


4 Following the usual custom, we omit zeros after the decimal point in the various differences.

5 For details concerning central differences, see references cited at end of chapter.


13.3. Interpolation for Unequal Values of the Argument.—When the values of x are given for unequal intervals, (3) and (4) do not apply, but it is possible to use divided differences or the interpolation formula of Lagrange. Both methods are tedious to apply and not very precise, hence it is usually better to interpolate from a suitable graph. We give Lagrange’s formula only; for the method of divided differences, Whittaker and Robinson (loc. cit.) may be consulted. Suppose $x_0, x_1, \ldots, x_n$ and $y_0, y_1, \ldots, y_n$ are known, then for some other value of x,

$$y = f(x) = \frac{(x - x_1)(x - x_2)\cdots(x - x_n)}{(x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_n)}\, y_0 + \frac{(x - x_0)(x - x_2)\cdots(x - x_n)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)}\, y_1 + \cdots + \frac{(x - x_0)(x - x_1)\cdots(x - x_{n-1})}{(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1})}\, y_n \qquad (13\text{-}5)$$


Example 3. The following data were obtained in the calibration of a 
platinum-rhodium thermocouple. Find the temperature corresponding to 
a reading of 9.000 millivolts. 


t, °C. 630.5 960.5 1063.0 
e, millivolts 5.535 9.117 10.301 


With x = 9.000, $x_0$ = 5.535, $x_1$ = 9.117, $x_2$ = 10.301, $y_0$ = 630.5, $y_1$ = 960.5, $y_2$ = 1063.0,

$$t = \frac{(-0.117)(-1.301)(630.5)}{(-3.582)(-4.766)} + \frac{(3.465)(-1.301)(960.5)}{(3.582)(-1.184)} + \frac{(3.465)(-0.117)(1063.0)}{(4.766)(1.184)} = 950.4^{\circ}\text{C}.$$


The value obtained from a carefully constructed curve is 950.2°C. 
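
Lagrange’s formula (13-5) can be sketched in Python as follows, applied to the three calibration points of Example 3. The function name `lagrange` and the variable names are assumptions chosen here for illustration.

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial (13-5) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                # One factor (x - x_j)/(x_i - x_j) for every other node.
                term *= (x - xj) / (xi - xj)
        total += term
    return total

emf = [5.535, 9.117, 10.301]        # millivolts
temp = [630.5, 960.5, 1063.0]       # degrees C
t_at_9mv = lagrange(emf, temp, 9.000)
```

The polynomial reproduces the tabulated points exactly, so `lagrange(emf, temp, 9.117)` returns 960.5 up to rounding; this is a quick check on any implementation.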

13.4. Inverse Interpolation.—The problem of inverse interpolation, as 
the name implies, is that of finding a value of x corresponding to a given 
value of y = f(x). From Lagrange’s formula, it is seen that the roles of x 
and y may be interchanged so that (5) may be used for inverse interpolation by rewriting it to give $x = \phi(y)$. An illustration of this application
of (5) is shown in the following problem. Inverse interpolation may also
be effected by reversion of the series (3) or (4) to find u as a function of y
and Δy. The unknown x is then obtained from (2) or by a method of
successive approximations. Full details of both procedures are given by 
Scarborough (loc. cit.). 


Problem. From the data of Example 3, sec. 13.3, find the electromotive force of 
the thermocouple when the temperature is 750°C. 


13.5. Two-way Interpolation.—Suppose the tabulated quantity is given as a function of two independent variables, for example, the index of refraction of water as it varies with both temperature and wavelength. Interpolation to give a value of y for two variables not contained in such tables is best performed by using Newton’s formula to interpolate for each variable separately. Series, similar to Newton’s for direct, two-way interpolation, are given by Scarborough (loc. cit.).


NUMERICAL DIFFERENTIATION 


13.6. Differentiation Using Interpolation Formula.—In order to determine the numerical derivative of a function of x at a given point, the slope of the curve of the function may be obtained by graphical means or the data may be fitted to an empirical equation which is then differentiated. We may also write

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} \qquad (13\text{-}6)$$

and if we use (2) and (3) we get

$$\frac{dy}{dx} = \frac{1}{h}\frac{dy}{du} = \frac{1}{h}\left[\Delta y_k + \frac{(2u - 1)}{2!}\,\Delta^2 y_k + \frac{(3u^2 - 6u + 2)}{3!}\,\Delta^3 y_k + \cdots\right] \qquad (13\text{-}7)$$

At the point $x = x_k$, u = 0, so we have

$$\left(\frac{dy}{dx}\right)_{x = x_k} = \frac{1}{h}\left[\Delta y_k - \tfrac{1}{2}\Delta^2 y_k + \tfrac{1}{3}\Delta^3 y_k - \tfrac{1}{4}\Delta^4 y_k + \cdots\right]$$

$$\left(\frac{d^2y}{dx^2}\right)_{x = x_k} = \frac{1}{h^2}\left[\Delta^2 y_k - \Delta^3 y_k + \tfrac{11}{12}\Delta^4 y_k - \cdots\right] \qquad (13\text{-}8)$$

More terms and higher order derivatives may be readily found. Since the lower order differences disappear upon differentiation, the convergence of (8) is slower than that of (3) or (4), therefore derivatives obtained in this way are not very precise.
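
A minimal sketch of (13-8) in Python, carried through the fourth difference, is given below. It uses the values of $e^{-x^2}$ at x = 0, 0.05, ..., 0.30 to five places; the function name `derivatives_from_differences` is an assumption made here, and the analytic derivatives are computed only for comparison.

```python
def differences(y):
    """Columns of successive differences; cols[r][i] is Δ^r y_i."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        p = cols[-1]
        cols.append([p[i + 1] - p[i] for i in range(len(p) - 1)])
    return cols

def derivatives_from_differences(ys, h, k):
    """First and second derivative at x_k from (13-8), through Δ⁴."""
    d = differences(ys)
    d1, d2, d3, d4 = (d[r][k] for r in range(1, 5))
    first = (d1 - d2 / 2 + d3 / 3 - d4 / 4) / h
    second = (d2 - d3 + 11 * d4 / 12) / h ** 2
    return first, second

ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]
first, second = derivatives_from_differences(ys, h=0.05, k=1)

# Analytic values for y = exp(-x^2) at x = 0.05, for comparison only.
exact_first = -2 * 0.05 * ys[1]
exact_second = 2 * ys[1] * (2 * 0.05 ** 2 - 1)
```

Because the differences are formed from five-place data and then divided by h or h², rounding errors in the table are magnified, which is one reason the text calls such derivatives not very precise.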

Maxima or minima in a tabulated function may be found by substituting the differences in (7), equating the derivative to zero and solving for u and then for x from the relation $x = x_k + hu$.

Example 4. Find dy/dx and d²y/dx² for $y = e^{-x^2}$ at the point x = 0.05 from the data of Table 2.


i) 1 [ 0.00485 0.00019 200008) 
oo = =| ~0,00745 +? ~ 
dt}. ao95 0.05 ty t 4 
= — 0,09980 
2) 1 l 2.00068) 
“4 = — 0.00485 — 0.00019 
(a z=0.05 (0.05)? + 2 
= — 2.00000 


The values found by differentiation are
dy/dx = −2xy = −0.099750
d²y/dx² = 2y(2x² − 1) = −1.985025
13.7. Differentiation Using a Polynomial.—Another method of finding the derivative has been described by Rutledge.⁶ It does not depend on differences but assumes that the given data can be fitted to a polynomial of the fourth or lower degree. Five points must be known, that is, five values of x and y. If h is the equal interval between successive values of x, the derivative of y = f(x) at the point $x = x_k$ is given by the three following approximately equivalent expressions.

$$\left(\frac{dy}{dx}\right)_k = \frac{1}{12h}\left[3y_{k+1} + 10y_k - 18y_{k-1} + 6y_{k-2} - y_{k-3}\right]$$

$$= \frac{1}{12h}\left[(y_{k-2} - y_{k+2}) - 8(y_{k-1} - y_{k+1})\right]$$

$$= \frac{1}{12h}\left[y_{k+3} - 6y_{k+2} + 18y_{k+1} - 10y_k - 3y_{k-1}\right] \qquad (13\text{-}9)$$

These equations are particularly suitable for solution by one continuous operation with a calculating machine. The method may be extended to apply to polynomials of degree higher than four or to derivatives of higher order.
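
The three five-point expressions of (13-9) can be sketched in Python as follows. The function name `five_point_derivatives` and the labels `backward`, `central` and `forward` are assumptions made here for illustration; the data are the five-place values of $e^{-x^2}$ at x = 0, 0.05, ..., 0.30.

```python
def five_point_derivatives(y, k, h):
    """Three estimates of dy/dx at x_k from the expressions of (13-9)."""
    backward = (3 * y[k + 1] + 10 * y[k] - 18 * y[k - 1]
                + 6 * y[k - 2] - y[k - 3]) / (12 * h)
    central = ((y[k - 2] - y[k + 2]) - 8 * (y[k - 1] - y[k + 1])) / (12 * h)
    forward = (y[k + 3] - 6 * y[k + 2] + 18 * y[k + 1]
               - 10 * y[k] - 3 * y[k - 1]) / (12 * h)
    return backward, central, forward

ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]
estimates = five_point_derivatives(ys, k=3, h=0.05)   # x_k = 0.15

# Analytic derivative of exp(-x^2) at x = 0.15, for comparison only.
exact = -2 * 0.15 * ys[3]
```

Which expression applies depends on how many tabulated points lie on either side of $x_k$: the first needs three points below and one above, the second two on each side, the third one below and three above.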

Example 5. Find dy/dx at x = 0.15 for $y = e^{-x^2}$ using the data of Table 2 and the method of this section.

$$\left(\frac{dy}{dx}\right)_{x=0.15} = \frac{1}{12 \times 0.05}\left[3 \times 0.96079 + 9.7775 - 18 \times 0.99005 + 6 \times 0.99750 - 1\right] = -0.2934$$

$$= \frac{1}{12 \times 0.05}\left[(0.99750 - 0.93941) - 8(0.99005 - 0.96079)\right] = -0.2933$$

$$= \frac{1}{12 \times 0.05}\left[0.91393 - 6 \times 0.93941 + 18 \times 0.96079 - 9.7775 - 3 \times 0.99005\right] = -0.2933$$

By direct differentiation, dy/dx = −0.293325.

Problem. Use the data of Table 5 to find dy/dx and d²y/dx² at x = 0.75 by the methods of secs. 13.6 and 13.7.


NUMERICAL INTEGRATION 


13.8. Introduction.—Suppose f(x) is known to be continuous over an interval of x from a to b but that either the explicit form of f(x) is unknown or it is such a function that its definite integral cannot be determined conveniently in terms of other known functions. Numerical evaluation of such integrals, a process called approximate quadratures, depends on replacing the integral $\int_a^b f(x)\,dx$ by another integral $\int_a^b \phi(x)\,dx$ where $\phi(x)$ can be determined in a simple way. If f(x) is known to have the (n + 1) values $y_0, y_1, \ldots, y_n$ at (n + 1) points within the interval (a,b), the latter integral may be expressed as

$$\int_a^b \phi(x)\,dx = A_0 y_0 + A_1 y_1 + \cdots + A_n y_n \qquad (13\text{-}10)$$

where the (n + 1) quantities $A_m$ are independent of the (n + 1) values of the $y_m$. It follows that if f(x) is a polynomial of degree ≤ n, the error made in replacing $\int_a^b f(x)\,dx$ by $\sum A_m y_m$ may be made to vanish by the proper choice of the $A_m$. If f(x) is a polynomial of degree > n, the difference between the true value of the integral and (10) may still be small enough to make this procedure useful. We first consider the methods where the $y_m$ are known at equal intervals.

6 Rutledge, G., Phys. Rev. 40, 262 (1932).

13.9. The Euler-Maclaurin Formula.—If the explicit form of f(x) is known and it has finite derivatives at the upper and lower limits of the integral or if these derivatives may be determined by numerical methods, the Euler-Maclaurin formula may be used to evaluate the integral. Indicating the values of f(x) at x = a and at x = b by $y_0$ and $y_n$ and the intermediate values by $y_1, y_2, y_3, \ldots$, this formula is written

$$\int_a^b f(x)\,dx = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{2}\right] - \sum_{r=1}^{\infty} \frac{B_{2r}\,h^{2r}}{(2r)!}\left[y_n^{(2r-1)} - y_0^{(2r-1)}\right] \qquad (13\text{-}11)$$

where $y_n^{(r)}$ and $y_0^{(r)}$ are the r-th derivatives of f(x) at the points b and a. The numerical coefficients $B_r$ are the Bernoulli numbers, defined by the relation

$$\frac{x}{e^x - 1} = \sum_{r=0}^{\infty} \frac{B_r\,x^r}{r!} \qquad (13\text{-}12)$$

which may be rearranged to give the identity

$$\sum_{n=1}^{\infty} \frac{x^n}{n!}\ \sum_{r=0}^{\infty} \frac{B_r\,x^r}{r!} = x$$

Successive values of $B_r$ are obtained from this equation by equating the coefficients of equal powers of x to zero or more simply as follows. Expand the equation

$$(B + 1)^n = B^n \qquad (13\text{-}13)$$
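for a given value of n. Now set $B^r = B_r$ and if $B_0 = 1, B_1, B_2, \ldots$ are known, $B_{n-1}$ may be found. A few of the Bernoulli numbers⁷ are $B_1 = -\tfrac{1}{2}$; $B_3 = B_5 = B_7 = \cdots = 0$; $B_2 = \tfrac{1}{6}$, $B_4 = -\tfrac{1}{30}$, $B_6 = \tfrac{1}{42}$, $B_8 = -\tfrac{1}{30}$, $B_{10} = \tfrac{5}{66}$. Putting these numbers in (11) and using the notation (later to be clarified)

$$I_T = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{2}\right]$$

we obtain the Euler-Maclaurin formula in expanded form

$$\int_a^b f(x)\,dx = I_{EM} = I_T - \frac{h^2}{12}\,\Delta^{\prime} + \frac{h^4}{720}\,\Delta^{\prime\prime\prime} - \frac{h^6}{30240}\,\Delta^{v} + \frac{h^8}{1209600}\,\Delta^{vii} - \cdots \qquad (13\text{-}14)$$

where

$$\Delta^{r} = \left[y_n^{(r)} - y_0^{(r)}\right]$$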
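
The symbolic rule (13-13) can be sketched in Python with exact fractions. Expanding $(B + 1)^n = B^n$ and replacing $B^r$ by $B_r$ gives $\sum_{r=0}^{n-1} \binom{n}{r} B_r = 0$, which is solved for $B_{n-1}$. The function name `bernoulli` is an assumption made here.

```python
from fractions import Fraction
from math import comb

def bernoulli(count):
    """Return [B_0, ..., B_{count-1}] from the relation (B + 1)^n = B^n."""
    b = [Fraction(1)]                       # B_0 = 1
    for n in range(2, count + 1):
        # sum_{r=0}^{n-1} C(n, r) B_r = 0; the r = n-1 term is n * B_{n-1}.
        s = sum(comb(n, r) * b[r] for r in range(n - 1))
        b.append(-s / n)
    return b

B = bernoulli(11)
# Coefficients of the correction terms in (13-14): |B_2r| / (2r)!
coefficients = [abs(B[2 * r]) / Fraction(
    1 if r == 0 else __import__("math").factorial(2 * r)) for r in range(1, 5)]
```

The list `coefficients` holds 1/12, 1/720, 1/30240 and 1/1209600, the numerical factors that multiply the powers of h in (13-14).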
TABLE 3

x      1/x          Δ          Δ²        Δ³        Δ⁴
1.0    1.000000
1.2    0.833333    −166667
1.4    0.714286    −119047    +47620
1.6    0.625000    −89286     +29761    −17859
1.8    0.555556    −69444     +19842    −9919     +7940
2.0    0.500000    −55556     +13888    −5954     +3965

Example 6. Divide the interval between 1.0 and 2.0 into five equal parts and evaluate the integral $I = \int_{1.0}^{2.0} dx/x$ by the Euler-Maclaurin formula. The required values of f(x) = 1/x are given in Table 3. We also need the derivatives of odd order which are

$$f^{\prime}(x) = -\frac{1}{x^2}; \qquad f^{(n)}(x) = \frac{(-1)^n\, n!}{x^{n+1}}$$

Then $f^{\prime}(1) = -1$; $f^{\prime\prime\prime}(1) = -6$; $f^{v}(1) = -120$; $f^{\prime}(2) = -0.25$; $f^{\prime\prime\prime}(2) = -0.375$; $f^{v}(2) = -1.875$. Since h = 0.2, we also find $h^2/12$ = 0.003333; $h^4/720$ = 2.222 × 10⁻⁶; $h^6/30240$ = 2.1 × 10⁻⁹; $\Delta^{\prime}$ = 0.75; $\Delta^{\prime\prime\prime}$ = 5.625; $\Delta^{v}$ = 118.1; $I_T$ = 0.695635. Hence,

$$I_{EM} = 0.695635 - 0.002500 + 0.000012 - (2.5 \times 10^{-7}) = 0.693147$$

The fifth derivatives contribute to the final result only after six significant figures. A more exact value of the integral is ln 2 = 0.69314718.

7 Those with even subscript only are required in the Euler-Maclaurin formula.
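
The calculation of Example 6 can be sketched in Python as follows. The helper names `trapezoid` and `euler_maclaurin_inverse_x` are assumptions made here; the routine is written only for f(x) = 1/x, whose r-th derivative is $(-1)^r r!/x^{r+1}$, and keeps the first three correction terms of (13-14).

```python
import math

def trapezoid(f, a, b, n):
    """I_T: the principal term, with n equal intervals."""
    h = (b - a) / n
    inner = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))

def euler_maclaurin_inverse_x(a, b, n):
    """(13-14) for f(x) = 1/x, through the term in h^6."""
    h = (b - a) / n

    def deriv(r, x):
        return (-1) ** r * math.factorial(r) / x ** (r + 1)

    def delta(r):
        return deriv(r, b) - deriv(r, a)

    return (trapezoid(lambda x: 1.0 / x, a, b, n)
            - h ** 2 / 12 * delta(1)
            + h ** 4 / 720 * delta(3)
            - h ** 6 / 30240 * delta(5))

i_t = trapezoid(lambda x: 1.0 / x, 1.0, 2.0, 5)
i_em = euler_maclaurin_inverse_x(1.0, 2.0, 5)
```

Comparing `i_em` with `math.log(2)` shows how much of the trapezoidal error the three correction terms remove.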

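13.10. Gregory’s Formula.—In case the explicit form of f(x) is unknown, we may rewrite (11), using (8) in place of the derivatives and obtain Gregory’s formula

$$\int_a^b f(x)\,dx = I_G = h\left[\frac{y_0}{2} + y_1 + \cdots + y_{n-1} + \frac{y_n}{2}\right] - \frac{h}{12}\left(\Delta y_{n-1} - \Delta y_0\right) - \frac{h}{24}\left(\Delta^2 y_{n-2} + \Delta^2 y_0\right) - \frac{19h}{720}\left(\Delta^3 y_{n-3} - \Delta^3 y_0\right) - \frac{3h}{160}\left(\Delta^4 y_{n-4} + \Delta^4 y_0\right) - \cdots \qquad (13\text{-}15)$$

It should be observed that the contents of the parentheses are alternately differences and sums. Additional coefficients of $-h(\Delta^r y_{n-r} \pm \Delta^r y_0)$ may be found by evaluating the definite integral

$$\frac{(-1)^r}{(r + 1)!} \int_0^1 z(z - 1)(z - 2)\cdots(z - r)\,dz \qquad (13\text{-}16)$$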


Example 7. Evaluate the integral of Example 6 by means of Gregory’s formula. We find h/12 = 0.01667; h/24 = 0.008333; 19h/720 = 0.0053; 3h/160 = 0.0038. Hence,

$$I_G = 0.695635 - 0.01667\,(-0.055556 + 0.166667) - 0.008333\,(0.013888 + 0.047620) - 0.0053\,(-0.005954 + 0.017859) - 0.0038\,(0.003965 + 0.007940) = 0.693163$$

The result is not as precise as that obtained in Example 6 because of the small number of available differences.
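
Gregory’s formula (13-15), through the fourth differences, can be sketched in Python as follows. The function name `gregory` is an assumption made here; the data are the six ordinates of 1/x used in Example 7.

```python
def differences(y):
    """Columns of successive differences; cols[r][i] is Δ^r y_i."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        p = cols[-1]
        cols.append([p[i + 1] - p[i] for i in range(len(p) - 1)])
    return cols

def gregory(ys, h):
    """(13-15) through Δ⁴; needs at least five intervals."""
    d = differences(ys)
    n = len(ys) - 1
    total = h * (0.5 * ys[0] + sum(ys[1:n]) + 0.5 * ys[n])
    total -= h / 12 * (d[1][n - 1] - d[1][0])        # difference
    total -= h / 24 * (d[2][n - 2] + d[2][0])        # sum
    total -= 19 * h / 720 * (d[3][n - 3] - d[3][0])  # difference
    total -= 3 * h / 160 * (d[4][n - 4] + d[4][0])   # sum
    return total

xs = [1.0 + 0.2 * i for i in range(6)]
ys = [1.0 / x for x in xs]
i_g = gregory(ys, 0.2)
```

Only tabulated ordinates enter the calculation, which is the point of the method when the explicit form of f(x) is unknown.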

Problem. Evaluate the integral of Example 8, sec. 13.11, by the Euler-Maclaurin 
and Gregory formulas. Divide the interval into five equal parts. 

13.11. The Newton-Cotes Formula.—Instead of using differences, it is possible to rewrite (11) or (15) in terms of the $y_m$ since the $\Delta^r y_m$ may be reduced to sums of $y_m$ by means of (1). The resulting equation, called the Newton-Cotes formula, is of the form of (10) where

$$A_m = \frac{(-1)^{n-m}\,h}{m!\,(n - m)!} \int_0^n \frac{z(z - 1)(z - 2)\cdots(z - n)}{(z - m)}\,dz \qquad (13\text{-}17)$$

Table 4 gives the $A_m$ for several values of n. The values found in this way may be easily checked since it is necessary that

$$A_0 + A_1 + A_2 + \cdots + A_n = nh \qquad (13\text{-}18)$$

When this method is used, it is simpler to divide the interval from a to b into a number of sub-intervals. The number of $y_m$’s in each of these determines the appropriate $A_m$ from Table 4. The value of the integral then equals the sum of the separate terms obtained by applying (10) to each sub-interval. We give a few special cases of the Newton-Cotes formula.


TABLE 4

n = 1    A0 = A1 = h/2
    2    A0 = A2 = h/3;      A1 = 4h/3
    3    A0 = A3 = 3h/8;     A1 = A2 = 9h/8
    4    A0 = A4 = 14h/45;   A1 = A3 = 64h/45;   A2 = 8h/15
    5    A0 = A5 = 95h/288;  A1 = A4 = 125h/96;  A2 = A3 = 125h/144
    6    A0 = A6 = 41h/140;  A1 = A5 = 54h/35;   A2 = A4 = 27h/140;  A3 = 204h/105
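
The coefficients of the Newton-Cotes formula can be generated from the integral defining $A_m$ and checked against (13-18). The following Python sketch does this in exact rational arithmetic, returning the ratios $A_m/h$; the function name `newton_cotes_weights` and the representation of a polynomial as a list of coefficients (lowest power first) are assumptions made here.

```python
from fractions import Fraction
from math import factorial

def newton_cotes_weights(n):
    """Return [A_0/h, ..., A_n/h] for the (n + 1)-point closed formula."""
    weights = []
    for m in range(n + 1):
        poly = [Fraction(1)]                  # coefficients, lowest power first
        for j in range(n + 1):
            if j == m:
                continue
            # Multiply the polynomial by (z - j).
            shifted = [Fraction(0)] + poly    # z * p(z)
            scaled = [j * c for c in poly] + [Fraction(0)]
            poly = [s - t for s, t in zip(shifted, scaled)]
        # Integrate term by term from 0 to n.
        integral = sum(c * Fraction(n) ** (i + 1) / (i + 1)
                       for i, c in enumerate(poly))
        sign = (-1) ** (n - m)
        weights.append(sign * integral / (factorial(m) * factorial(n - m)))
    return weights

simpson_weights = newton_cotes_weights(2)
weddle_source = newton_cotes_weights(6)
```

For every n the returned ratios add up to n, which is the check (13-18) with h divided out.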


a. The Trapezoidal Rule. If each sub-interval contains two values of $y_m$, n = 1, $A_0 = A_1 = h/2$ and⁸

$$I_T = \int_a^b f(x)\,dx = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{2}\right] \qquad (13\text{-}19)$$

This result is exact if the first differences of f(x) are constant. It will be noticed that (19) forms the principal term in both the Euler-Maclaurin and Gregory formulas.


b. Simpson’s Rule. If there are an uneven number of $y_m$ and we divide each sub-interval in two parts, we obtain, with n = 2 from Table 4, Simpson’s One-Third Rule:

$$I_S = \int_a^b f(x)\,dx = \frac{h}{3}\left[y_0 + 4(y_1 + y_3 + \cdots + y_{n-1}) + 2(y_2 + y_4 + \cdots + y_{n-2}) + y_n\right] \qquad (13\text{-}20)$$

This is exact if second differences of f(x) are constant. It is probably the most generally useful of all quadrature formulas.


8 In order to avoid confusion, it should be noted that n has been taken with two meanings. In Table 4, it refers to the number of intervals between the lower and upper limits of the integral. It now refers to the number of divisions of the sub-intervals. As a subscript in (19), (20) and (21) it indicates the last available value of $y_m$ as in previous equations.
c. Weddle’s Rule. Taking n = 6 from Table 4 and neglecting all differences above the sixth, we obtain Weddle’s Rule:

$$I_W = \int_a^b f(x)\,dx = \frac{3h}{10}\left[y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + 2y_6 + 5y_7 + y_8 + 6y_9 + y_{10} + 5y_{11} + 2y_{12} + \cdots + 2y_{n-6} + 5y_{n-5} + y_{n-4} + 6y_{n-3} + y_{n-2} + 5y_{n-1} + y_n\right] \qquad (13\text{-}21)$$

This is the most accurate of the formulas⁹ in this section but it has the disadvantage that the interval must be divided into a number of parts equal to six or some multiple of it.

Various other special cases of the Newton-Cotes equation may be developed. The best known of these, generally called Simpson’s Three-Eighths Rule, is obtained from (10) and Table 4 with n = 3. As shown by Scarborough (loc. cit.), it is inferior to the One-Third Rule and should never be used.


Example 8. Evaluate $\int_0^{1.50} \frac{x^3}{e^x - 1}\,dx$ by the three preceding methods. This integral is of importance in the Debye theory of the heat capacity of solids;¹⁰ it cannot be evaluated in terms of other known functions. Values of the integral between 0 and n, with n from 0.01 to 24 in steps of 0.01 have been given to six places by Beattie;¹¹ from his table, I = 0.615495. Dividing the interval 0 to 1.50 into six equal parts, we obtain Table 5. Since h = 0.25, we find

$$I_T = 0.25 \times (1.991643 + 0.484678) = 0.619080$$

$$I_S = \frac{0.25}{3} \times (4 \times 1.216979 + 2 \times 0.774664 + 0.969357) = 0.615550$$

$$I_W = \frac{3 \times 0.25}{10} \times (5 \times 0.839293 + 1.744021 + 6 \times 0.377686) = 0.615495$$

It is thus seen that the trapezoidal rule is the least accurate of these three equations while Weddle’s rule and Simpson’s rule give nearly the same results.

9 Note that the last term in (21) has the coefficient unity if n = 6 or some multiple of 6. In deriving this formula, the coefficient of the term $\Delta^6 y_0$ is 41/140, which is taken to be 3/10 in order to make the final form of the equation as simple as possible. The resulting error is negligible.

10 See, for example, Taylor, H. S. and Glasstone, S., “A Treatise on Physical Chemistry”, Vol. 1, Third Edition, Chapter IV, D. Van Nostrand Co., Inc., New York, 1942.

11 Beattie, J. Math. Phys. 6, 1 (1926).


TABLE 5

x       f(x) = x³/(e^x − 1)
0       0
0.25    0.055013
0.50    0.192687
0.75    0.377686
1.00    0.581977
1.25    0.784280
1.50    0.969357
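
The three rules of this section can be sketched in Python as follows, applied to the integrand and interval of Table 5. The function names are assumptions made here; the ordinates are computed from $x^3/(e^x - 1)$ directly rather than copied from the table, with the limiting value 0 taken at x = 0.

```python
import math

def f(x):
    # x^3/(e^x - 1); the limit as x -> 0 is 0.
    return 0.0 if x == 0 else x ** 3 / math.expm1(x)

def trapezoidal(y, h):
    return h * (0.5 * y[0] + sum(y[1:-1]) + 0.5 * y[-1])

def simpson(y, h):
    # Requires an odd number of ordinates (an even number of intervals).
    return h / 3 * (y[0] + 4 * sum(y[1:-1:2]) + 2 * sum(y[2:-1:2]) + y[-1])

def weddle(y, h):
    # Requires the number of intervals to be a multiple of six.
    pattern = [1, 5, 1, 6, 1, 5, 1]
    total = 0.0
    for start in range(0, len(y) - 1, 6):
        # Adjacent panels share an end point, which produces the weight 2.
        total += sum(c * v for c, v in zip(pattern, y[start:start + 7]))
    return 3 * h / 10 * total

h = 0.25
y = [f(i * h) for i in range(7)]
i_t, i_s, i_w = trapezoidal(y, h), simpson(y, h), weddle(y, h)
```

Calling the same three functions with h = 0.125 and thirteen ordinates repeats the comparison with twelve sub-intervals.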


Problem a. Compute some of the coefficients of Table 4.
Problem b. Divide the interval of the integral of Table 5 into twelve equal parts
and perform the integrations by the three methods of this section.


13.12. Gauss’ Method.—The method of Gauss not only determines the (n + 1) values of $A_m$ but also fixes the (n + 1) $y_m$’s of (10) in such a way that the difference between $\int_a^b \phi(x)\,dx$ and $\int_a^b f(x)\,dx$ is a minimum. Since there are now (2n + 2) constants available, it follows that if f(x) is a polynomial of degree ≤ (2n + 1), the method will give an exact result for the integral. It will be remembered that the Newton-Cotes method will be exact under similar conditions, if the degree of the polynomial ≤ n, hence Gauss’ method will give a more nearly exact result than the Newton-Cotes method with the same number of values of $y_m$, or conversely the Newton-Cotes method requires a larger number of known values of the function than Gauss’ method for the same allowed error. This is a matter of some importance especially when the given values of $y_m$ are limited in number as they are likely to be when they result from experimental measurements. In applying the method, it is convenient to change the limits of the integral $\int_a^b f(x)\,dx$ by making the substitution

$$x = a + (b - a)v \qquad (13\text{-}22)$$

hence in terms of the new variable v, the limits are 0 and 1. Then,

$$f(x) = f[a + (b - a)v] = F(v)$$

$$dx = (b - a)\,dv \qquad (13\text{-}23)$$

and

$$I_G = \int_a^b f(x)\,dx = (b - a)\int_0^1 F(v)\,dv \qquad (13\text{-}24)$$
Developing F(v) in a series similar to (10), we have

$$I_G = (b - a)\left[R_0 F_0 + R_1 F_1 + R_2 F_2 + \cdots + R_n F_n\right] \qquad (13\text{-}25)$$

where $F_m$ means the numerical value of $F(v_m)$. Now it can be shown¹² that the difference between $\int_a^b f(x)\,dx$ and $I_G$ of (25) is made a minimum provided $v_m$ and $R_m$ are determined by the relations

$$\sum_{m=0}^{n} R_m = 1; \quad \sum_{m=0}^{n} R_m v_m = \tfrac{1}{2}; \quad \sum_{m=0}^{n} R_m v_m^2 = \tfrac{1}{3}; \quad \ldots; \quad \sum_{m=0}^{n} R_m v_m^r = \frac{1}{(r + 1)} \qquad (13\text{-}26)$$

Since there are (2n + 2) constants to be evaluated, the most direct procedure would be to solve simultaneously (2n + 2) equations like (26). This, however, is very laborious even for small values of n but the $v_m$ alone may be found in the following way. Let $z_0, z_1, z_2, \ldots, z_n$ be the (n + 1) real roots of the Legendre polynomial¹³ $P_{n+1}$ of degree (n + 1) obtained from the equation $P_{n+1}(z) = 0$. Then,

$$v_0 = \tfrac{1}{2}(1 + z_0), \quad v_1 = \tfrac{1}{2}(1 + z_1), \quad \ldots, \quad v_n = \tfrac{1}{2}(1 + z_n) \qquad (13\text{-}27)$$

With the (n + 1) values of $v_m$ determined in this way, it is a simple matter to find the remaining constants $R_m$, (n + 1) in number, for it is only necessary to solve simultaneously (n + 1) relations like (26). Values of both $v_m$ and $R_m$ are given in Table 6.¹⁴

Some writers make the substitution

$$x = \frac{(a + b)}{2} + \frac{(b - a)}{2}\,w$$

which changes the limits of the integral to ±1. In this case,

$$I = \frac{b - a}{2}\int_{-1}^{+1} g(w)\,dw = \frac{b - a}{2}\sum_{m=0}^{n} r_m\, g(w_m)$$

where g(w) corresponds to the former F(v) and

$$r_m = 2R_m; \qquad w_m = 2v_m - 1$$

12 The proof is given by Hobson, E. W., “The Theory of Spherical and Ellipsoidal Harmonics,” Cambridge Press, 1931.

13 See sec. 3.3.

14 More extensive lists may be found in “Tables of Lagrangian Interpolation Coefficients,” Columbia University Press, New York, 1944.
It will be observed that in Gauss’ method, the interval is not subdivided equally as in the preceding cases but it is divided symmetrically about the mid-point.

TABLE 6

n = 2    v0 = 0.11270166    R0 = R2 = 5/18
         v1 = 0.5           R1 = 4/9
         v2 = 0.88729833

    3    v0 = 0.06943184    R0 = R3 = 0.17392742
         v1 = 0.33000948    R1 = R2 = 0.32607258
         v2 = 0.66999052
         v3 = 0.93056816

    4    v0 = 0.04691008    R0 = R4 = 0.11846344
         v1 = 0.23076534    R1 = R3 = 0.23931434
         v2 = 0.5           R2 = 0.28444444
         v3 = 0.76923466
         v4 = 0.95308992

    5    v0 = 0.03376524    R0 = R5 = 0.08566225
         v1 = 0.16939531    R1 = R4 = 0.18038079
         v2 = 0.38069041    R2 = R3 = 0.23395697
         v3 = 0.61930959
         v4 = 0.83060469
         v5 = 0.96623476

Example 9. Apply Gauss’ method to the integral of Examples 6 and 7, subdividing the interval into four parts. From (22) and (24), we find x = 1 + v and $I_G = \int_0^1 F(v)\,dv$. From Table 6, with n = 3, we obtain $F_0$ = 1/1.069432 = 0.935076; $F_1$ = 1/1.330009 = 0.751875; $F_2$ = 1/1.669990 = 0.598806; $F_3$ = 1/1.930568 = 0.517982. Then,

$$I_G = 0.173927 \times (0.935076 + 0.517982) + 0.326072 \times (0.751875 + 0.598806) = 0.693145$$

The result is as precise as that obtained by the Euler-Maclaurin or Gregory formulas but entails much less work.
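
Gauss’ method with the n = 3 constants of Table 6 can be sketched in Python as follows. The lists `V` and `R` and the function name `gauss` are assumptions made here; the substitution x = a + (b − a)v of (13-22) is applied inside the function, so the factor (b − a) of (13-24) appears explicitly.

```python
# Abscissas v_m and coefficients R_m on (0, 1) for n = 3, from Table 6.
V = [0.06943184, 0.33000948, 0.66999052, 0.93056816]
R = [0.17392742, 0.32607258, 0.32607258, 0.17392742]

def gauss(f, a, b):
    """Four-point Gauss quadrature of f over (a, b)."""
    # x = a + (b - a) v maps the interval (0, 1) onto (a, b).
    return (b - a) * sum(r * f(a + (b - a) * v) for r, v in zip(R, V))

i_gauss = gauss(lambda x: 1.0 / x, 1.0, 2.0)

# With n = 3 the rule should integrate polynomials up to degree 2n + 1 = 7
# exactly, apart from the rounding of the eight-figure constants.
moment_7 = gauss(lambda x: x ** 7, 0.0, 1.0)     # exact value is 1/8
```

Only four function values are needed, against six ordinates in the Euler-Maclaurin and Gregory examples for the same integral.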
Problem a. Find values of $v_m$ and $R_m$ for n = 2. Hint: $P_3(z) = \frac{5z^3 - 3z}{2} = 0$.
Problem b. Evaluate the integral of Example 6, sec. 13.9 by Gauss’ method.
Use the limits 1.0 and 3.0, subdividing this interval into four divisions.


13.13. Remarks Concerning Quadrature Formulas.—The selection of 
the most suitable quadrature formula to use in a specific case is a matter for 
which no general rules can be given. When the explicit form of f(x) is
known and the differentiations easily made, the Euler-Maclaurin formula
has the advantage of giving a result to any required number of figures. 
When the explicit form of f(x) is not known or if it cannot be differentiated 
easily, Gregory’s formula is useful. As previously stated, the Newton- 
Cotes formula and its special cases such as the trapezoidal rule, Simpson’s 
and Weddle’s rules are approximations to the Euler-Maclaurin and Gregory 
formulas; they have the advantage of requiring less labor to apply than 
the two former but result in a loss of accuracy. Gauss’ method is appar- 
ently not used as often as might be expected in chemical and physical 
caleulations. Since calculating machines are commonly used in such work, 
the application of it is not laborious and the resulting precision should 
recommend it. 

The reader should remember that in approximate quadratures, the
integrand is being replaced by a polynomial, the latter instead of the original
function then being integrated. It thus follows that the reliability of
the result is determined by the fidelity with which the approximating
polynomial matches the given function. Since Gauss’ formula fits a
polynomial of given degree with fewer known points than any of the other
formulas, it should be preferred when the function is of such a form that it
can be used. Even if the explicit form of f(x) is unknown, Gauss’ formula
may still be applied but it requires interpolation between the given
$y_m$ to find the proper F(v). When the $y_m$ are the results of experiment and
can be arranged at will, Gauss’ formula in fact prescribes their optimum
positions as those determined by the $v_m$.

One caution regarding quadrature formulas should be mentioned. If
the graph of f(x) is such that the area under one portion of the curve is
much larger than that under another portion, the integral should be
evaluated separately for each area. The value of h for the sub-interval
contributing the least amount to the final result may then be taken as a
larger quantity than the h-value for the remaining sub-intervals. If
nothing is known of the behavior of f(x), a graph should always be drawn.


NUMERICAL SOLUTION OF DIFFERENTIAL EQUATIONS 


13.14. Introduction.—One often encounters differential equations which
cannot be solved by any of the methods described in Chapter 2, except
that of solution in terms of a series, and this method may be difficult to
apply in certain cases. Even when an analytical solution is available, it is
sometimes not easy to find numerical values of corresponding pairs of the
dependent and independent variables. For example, if the initial conditions
$x_0 = 0$, $y_0 = 1$ are given for $dy/dx = (y - x)/(y + x)$, the solution
is $\tfrac{1}{2}\ln(x^2 + y^2) + \tan^{-1}(y/x) = \pi/2$, but the labor of finding values
of x for given values of y will be very great. In cases of this kind, it is
possible to proceed by graphical¹⁵ or numerical methods. The object of
the latter is to obtain a table of x and y over the range of x required by
the particular problem at hand. When a few such values are known, the
table may be extended rather easily, as will be shown. Special methods
are required for finding the first few values of y. We present four different
ways of starting the solution of a differential equation by numerical methods,
and then show how the solution may be continued by extrapolation.

13.15. The Taylor Series Method.—Suppose a differential equation
of the first order is given:

$$\frac{dy}{dx} = f(x,y)$$ (13-28)

with initial values $x = x_0$, $y = y_0$. We may then write the Taylor series

$$y = y_0 + (x - x_0)\,y_0' + \frac{(x - x_0)^2}{2!}\,y_0'' + \frac{(x - x_0)^3}{3!}\,y_0''' + \cdots + \frac{(x - x_0)^n}{n!}\,y_0^{(n)} + \cdots$$ (13-29)

If it is possible to find the various derivatives, the calculations may be
extended to as many values of x as desired.
Example 10. Start the solution of the differential equation

$$\frac{dy}{dx} = e^{-x} - y$$ (13-30)

with initial conditions $x_0 = 0$, $y_0 = 0$. The exact solution of (30) is
found by the methods of Chapter 2 to be $y = xe^{-x}$; the reader will recognize
that it is of the form of the differential equation occurring in the study
of radioactive disintegration and in the kinetics of chemical reactions
involving consecutive first order decompositions. Since $y' = e^{-x} - y$,
$y'' = -e^{-x} - y'$, $\cdots$, $y^{(n)} = (-1)^{n-1}e^{-x} - y^{(n-1)}$, it follows that
$y_0^{(n)} = (-1)^{n+1}n$ and from (29),

$$y = x - x^2 + \frac{x^3}{2!} - \frac{x^4}{3!} + \frac{x^5}{4!} - \cdots$$

15 For graphical methods, see Levy, H., and Baggott, E. A., “Numerical Studies
in Differential Equations,” Vol. 1, Watts and Co., London, 1934, or Sherwood and
Reed, “Applied Mathematics in Chemical Engineering,” McGraw-Hill Book Co.,
New York, 1939.

Taking x as 0.1, 0.2 and 0.3, we find the results which appear in Table 7.


TABLE 7

x      y        Exact values of y
0      0        0
0.1    0.0905   0.09048
0.2    0.1637   0.16375
0.3    0.2222   0.22224


While the method is very simple, it is often tedious to apply as the 
successive derivatives may become difficult to handle and even at x = 0.3 
in this case, the fifth derivative is needed. However, it would appear that 
this procedure is preferable to any other in finding the first few values of y 
when it is possible to use it. 
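
The series of Example 10 is easily summed by machine. The sketch below is illustrative only: the function name `taylor_y` and the choice of eight terms are assumptions, and the derivative values at the origin are those derived in the example.

```python
import math


def taylor_y(x, terms=8):
    """Sum the Taylor series (29) about x0 = 0 for dy/dx = exp(-x) - y, y(0) = 0.

    The n-th derivative at the origin is (-1)**(n + 1) * n, as found in Example 10.
    """
    total = 0.0
    for n in range(1, terms + 1):
        derivative_at_0 = (-1) ** (n + 1) * n
        total += derivative_at_0 * x ** n / math.factorial(n)
    return total


# Compare the truncated series with the exact solution y = x exp(-x).
for x in (0.1, 0.2, 0.3):
    print(f"x = {x:.1f}   series = {taylor_y(x):.5f}   exact = {x * math.exp(-x):.5f}")
```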

13.16. The Method of Picard (Successive Approximations or Iteration).—
From (28), we see that a solution may be found in the form of an
integral equation

$$y = y_0 + \int_{x_0}^{x} \frac{dy}{dx}\,dx = y_0 + \int_{x_0}^{x} f(x,y)\,dx$$ (13-31)

An approximate solution of this equation may be made by assuming that
$y = y_0$ under the integral sign. The integral may then be evaluated (by
quadratures, if necessary) since it is only a function of x and the constant
$y_0$. Denoting this first approximation to y by ${}^1y$,

$${}^1y = y_0 + \int_{x_0}^{x} f(x,y_0)\,dx$$ (13-32)

The process may be repeated to give

$${}^2y = y_0 + \int_{x_0}^{x} f(x,{}^1y)\,dx$$ (13-33)

and so on.
Example 11. Start the solution of Example 10, sec. 13.15 by this
method.

$${}^1y_1 = y_0 + \int_0^x (e^{-x} - y_0)\,dx = \int_0^x e^{-x}\,dx = 1 - e^{-x}$$

$${}^2y_1 = y_0 + \int_0^x (e^{-x} - {}^1y_1)\,dx = \int_0^x (2e^{-x} - 1)\,dx = 2(1 - e^{-x}) - x$$

$${}^3y_1 = 3(1 - e^{-x}) - \frac{x(4 - x)}{2}$$

With x = 0.1, ${}^3y_1$ = 0.0906; the next approximation, ${}^4y_1$, is the same as
${}^3y_1$, hence we proceed to calculate $y_2$ at x = 0.2 from the relations

$${}^1y_2 = {}^3y_1 + \int_{0.1}^{x} (e^{-x} - {}^3y_1)\,dx = {}^3y_1 + e^{-0.1} + 0.1\,{}^3y_1 - e^{-x} - 0.0906x = 1.0045 - e^{-x} - 0.0906x$$

$${}^2y_2 = {}^3y_1 + \int_{0.1}^{x} (e^{-x} - {}^1y_2)\,dx$$

$${}^3y_2 = {}^3y_1 + \int_{0.1}^{x} (e^{-x} - {}^2y_2)\,dx = 0.1639$$

The next value, ${}^4y_2$, is the same as ${}^3y_2$, so we go on in the same way to find
${}^1y_3$, etc., at x = 0.3. The results by the Picard method are seen from
Table 7 to be not quite as good as those obtained in Example 10. Moreover,
the disadvantages here are similar to those of the method of sec. 13.15, for
the successive integrals may become more and more difficult to determine.
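
When the integrals cannot be found in closed form, the iteration may be carried out by quadrature. The following is a minimal sketch under stated assumptions: the function name `picard`, the grid of 200 sub-intervals and the use of the trapezoidal rule are choices made here for illustration, not the analytical integration of Example 11.

```python
import math


def picard(f, x0, y0, x_end, iterations=8, steps=200):
    """Picard iteration (31)-(33) on a uniform grid from x0 to x_end.

    Each pass replaces y under the integral sign by the previous approximation
    and integrates numerically with the trapezoidal rule.
    Returns the list of y values on the grid.
    """
    h = (x_end - x0) / steps
    xs = [x0 + i * h for i in range(steps + 1)]
    ys = [y0] * (steps + 1)            # zeroth approximation: y = y0 everywhere
    for _ in range(iterations):
        g = [f(x, y) for x, y in zip(xs, ys)]
        new = [y0]
        for i in range(steps):
            new.append(new[-1] + 0.5 * h * (g[i] + g[i + 1]))
        ys = new
    return ys


f = lambda x, y: math.exp(-x) - y
first = picard(f, 0.0, 0.0, 0.1, iterations=1)[-1]       # corresponds to 1 - exp(-x)
converged = picard(f, 0.0, 0.0, 0.1, iterations=8)[-1]   # approaches x exp(-x)
print(f"first approximation: {first:.5f}   after 8 passes: {converged:.5f}")
```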

13.17. The Modified Euler Method.—If the intervals between successive
values of x are small enough we may write $\Delta x = h$ and

$$\Delta y = \left(\frac{dy}{dx}\right)\Delta x$$ (13-34)

An approximate value of $y_1$ at $x_1 = x_0 + h$ is then given by

$${}^1y_1 = y_0 + \Delta y = y_0 + \left(\frac{dy}{dx}\right)_0 h$$ (13-35)

An approximation to dy/dx at $x_1$ may be obtained by the relation

$${}^1\!\left(\frac{dy}{dx}\right)_1 = f(x_1, {}^1y_1)$$ (13-36)

which leads to an improved value of $y_1$

$${}^2y_1 = y_0 + \frac{h}{2}\left[\left(\frac{dy}{dx}\right)_0 + {}^1\!\left(\frac{dy}{dx}\right)_1\right]$$ (13-37)

This in turn will give a better approximation to dy/dx

$${}^2\!\left(\frac{dy}{dx}\right)_1 = f(x_1, {}^2y_1)$$ (13-38)

which may be used to compute the third approximation to $y_1$. The process
is repeated until there is no further change in the results. The values of y
and dy/dx at $x_2$ are found in a similar way.
Example 12. Start the solution of (30) by this method. Since
$x_0 = y_0 = 0$, $(dy/dx)_0 = 1$. With h = 0.1,

${}^1y_1$ = 1 × 0.1 = 0.1
${}^1(dy/dx)_1$ = $e^{-0.1}$ − 0.1 = 0.8048
${}^2y_1$ = 0.1(1.0 + 0.8048)/2 = 0.0902
${}^2(dy/dx)_1$ = 0.9048 − 0.0902 = 0.8146
${}^3y_1$ = 0.1(1.0 + 0.8146)/2 = 0.0907
${}^3(dy/dx)_1$ = 0.9048 − 0.0907 = 0.8141
${}^4y_1$ = 0.05(1.0 + 0.8141) = 0.0907

No further improvement results by continuing the approximations, so we
proceed to x = 0.2 with $y_1$ = 0.0907, $(dy/dx)_1$ = 0.8141. Then,

${}^1y_2$ = 0.0907 + 0.8141 × 0.1 = 0.1721
${}^1(dy/dx)_2$ = $e^{-0.2}$ − 0.1721 = 0.6466

and finally,

$y_2$ = 0.1641, $(dy/dx)_2$ = 0.6546

This method is tedious in application but perhaps less complicated than
either of the preceding methods since neither differentiation nor integration
is required.
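
The repeated correction of Example 12 lends itself to a short loop. The sketch below is illustrative: the function name `modified_euler_step`, the tolerance and the iteration limit are assumptions, while the formulas are (35) to (38).

```python
import math


def modified_euler_step(f, x0, y0, h, tol=1e-6, max_iter=50):
    """Advance one interval h by the modified Euler method."""
    slope0 = f(x0, y0)
    y1 = y0 + slope0 * h                          # first approximation, (35)
    for _ in range(max_iter):
        slope1 = f(x0 + h, y1)                    # slope at x1, (36) and (38)
        y_new = y0 + 0.5 * h * (slope0 + slope1)  # improved value, (37)
        if abs(y_new - y1) < tol:                 # no further change
            return y_new
        y1 = y_new
    return y1


f = lambda x, y: math.exp(-x) - y
y1 = modified_euler_step(f, 0.0, 0.0, 0.1)
y2 = modified_euler_step(f, 0.1, y1, 0.1)
print(f"y(0.1) = {y1:.4f}   y(0.2) = {y2:.4f}")
```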

13.18. The Runge-Kutta Method.—In this method it is necessary to
calculate the four quantities

$$k_1 = f(x_0, y_0)\,h$$
$$k_2 = f\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_1}{2}\right)h$$
$$k_3 = f\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_2}{2}\right)h$$ (13-39)
$$k_4 = f(x_0 + h,\; y_0 + k_3)\,h$$

Then,

$$x_1 = x_0 + h, \qquad y_1 = y_0 + \Delta y$$
$$\Delta y = \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$$ (13-40)

It will be noted that if f(x,y) is independent of y, (40) reduces to Simpson’s
rule. The same formulas are used to compute y at $x_2$, substituting $x_1$ and
$y_1$ for $x_0$ and $y_0$ in (39).

Example 13. With the same differential equation as before (Example
10, sec. 13.15), we find

$k_1$ = ($e^{-x_0}$ − $y_0$)h = 0.1
$k_2$ = ($e^{-0.05}$ − 0.05)0.1 = 0.0901
$k_3$ = ($e^{-0.05}$ − 0.0450)0.1 = 0.0906
$k_4$ = ($e^{-0.1}$ − 0.0906)0.1 = 0.0814

Hence,

$y_1 = y_0 + \Delta y$ = (1/6)(0.1 + 0.1802 + 0.1812 + 0.0814) = 0.0905

For the next interval, we find in a similar way, $k_1$ = 0.0814, $k_2$ = 0.0730,
$k_3$ = 0.0734, $k_4$ = 0.0655, $\Delta y$ = 0.0733, $y_2$ = 0.0905 + 0.0733 = 0.1638.

The error in the Runge-Kutta method is of the order of $h^5$. It will be
seen that its use is reasonably simple; it is probably the most generally
useful of the four methods given here.
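
Equations (39) and (40) translate directly into a routine. This is a minimal sketch; the function name `runge_kutta_step` is an assumption, and the test equation is that of Example 13.

```python
import math


def runge_kutta_step(f, x0, y0, h):
    """One interval of the Runge-Kutta method, equations (39) and (40)."""
    k1 = f(x0, y0) * h
    k2 = f(x0 + h / 2, y0 + k1 / 2) * h
    k3 = f(x0 + h / 2, y0 + k2 / 2) * h
    k4 = f(x0 + h, y0 + k3) * h
    return y0 + (k1 + 2 * k2 + 2 * k3 + k4) / 6


f = lambda x, y: math.exp(-x) - y
y1 = runge_kutta_step(f, 0.0, 0.0, 0.1)
y2 = runge_kutta_step(f, 0.1, y1, 0.1)
print(f"y(0.1) = {y1:.4f}   y(0.2) = {y2:.4f}")
```

No repetition within an interval is needed, in contrast to the modified Euler method.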

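13.19. Continuing the Solution.—When the first few values (three or
four) of y have been found by one of the preceding methods, the solution
may be continued by extrapolation. For this purpose, it is appropriate
to use Newton’s interpolation formula (4), rewriting it in terms of
$y' = dy/dx$, $y'_k$, and the differences $\Delta y'_{k-1}$, $\Delta^2 y'_{k-2}$, $\cdots$, $\Delta^n y'_{k-n}$. Upon
substituting this expression in the equation

$$\Delta y = \int_{x_1}^{x_2} y'\,dx$$ (13-41)

and performing the integration, several useful formulas may be obtained
by changing the limits of the definite integral.

$$(\Delta y)_k^{k+1} = h\left(y'_k + \tfrac{1}{2}\Delta y'_{k-1} + \tfrac{5}{12}\Delta^2 y'_{k-2} + \tfrac{3}{8}\Delta^3 y'_{k-3} + \tfrac{251}{720}\Delta^4 y'_{k-4}\right)$$ (13-42)

$$(\Delta y)_{k-1}^{k} = h\left(y'_k - \tfrac{1}{2}\Delta y'_{k-1} - \tfrac{1}{12}\Delta^2 y'_{k-2} - \tfrac{1}{24}\Delta^3 y'_{k-3} - \tfrac{19}{720}\Delta^4 y'_{k-4}\right)$$ (13-43)

$$(\Delta y)_{k-2}^{k-1} = h\left(y'_k - \tfrac{3}{2}\Delta y'_{k-1} + \tfrac{5}{12}\Delta^2 y'_{k-2} + \tfrac{1}{24}\Delta^3 y'_{k-3} + \tfrac{11}{720}\Delta^4 y'_{k-4}\right)$$ (13-44)

$$(\Delta y)_{k-3}^{k-2} = h\left(y'_k - \tfrac{5}{2}\Delta y'_{k-1} + \tfrac{23}{12}\Delta^2 y'_{k-2} - \tfrac{3}{8}\Delta^3 y'_{k-3} - \tfrac{19}{720}\Delta^4 y'_{k-4}\right)$$ (13-45)

$$(\Delta y)_{k-4}^{k-3} = h\left(y'_k - \tfrac{7}{2}\Delta y'_{k-1} + \tfrac{53}{12}\Delta^2 y'_{k-2} - \tfrac{55}{24}\Delta^3 y'_{k-3} + \tfrac{251}{720}\Delta^4 y'_{k-4}\right)$$ (13-46)

The meaning of a symbol such as $(\Delta y)_k^{k+1}$ should be clear. It is the increment
to be added to the k-th value of y in the difference table to obtain the
next value beyond, that is, the value of y at $x_{k+1}$. Equation (42) is thus
to be used for extending the table to larger values of x while the remaining
formulas are useful in checking the values of y already found.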
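
The use of (42) for extrapolation and of (43) as a check can be sketched as follows. The names `backward_differences` and `increment` are assumptions of this sketch, and the starting values are taken from the exact solution $y = xe^{-x}$ of (30) rather than from any of the tables.

```python
import math

PREDICT = [1, 1 / 2, 5 / 12, 3 / 8, 251 / 720]      # coefficients of (42)
CHECK = [1, -1 / 2, -1 / 12, -1 / 24, -19 / 720]    # coefficients of (43)


def backward_differences(values):
    """Return [y'_k, Δy'_{k-1}, Δ²y'_{k-2}, ...] from the tabulated y' values."""
    diffs = [values[-1]]
    row = list(values)
    while len(row) > 1 and len(diffs) < 5:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[-1])
    return diffs


def increment(derivs, h, coeffs):
    """Evaluate h times the bracket of (42) or (43), using the differences available."""
    return h * sum(c * d for c, d in zip(coeffs, backward_differences(derivs)))


f = lambda x, y: math.exp(-x) - y
h = 0.1
xs = [0.0, 0.1, 0.2]
ys = [x * math.exp(-x) for x in xs]            # assumed starting values
derivs = [f(x, y) for x, y in zip(xs, ys)]

y3_predicted = ys[-1] + increment(derivs, h, PREDICT)   # extrapolate with (42)
derivs.append(f(0.3, y3_predicted))                      # y' at the new point
y3_checked = ys[-1] + increment(derivs, h, CHECK)        # check with (43)
print(f"predicted {y3_predicted:.4f}   checked {y3_checked:.4f}")
```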




Example 14. Extend the integration of the differential equation of
Examples 10, 11, 12 and 13, using the values of y' found in Example 12.
We first collect the data as shown in Table 8. To check y at x = 0.1,
let us use (44). Then since k = 2,

$$(\Delta y)_0^1 = 0.1\left(0.6546 + \tfrac{3}{2}\,0.1595 + \tfrac{5}{12}\,0.0264\right) = 0.0905$$

Thus, $y_1 = y_0 + (\Delta y)_0^1$ = 0.0905, which shows that the result in Table 8
is in error by 2 units in the last place. Similarly, to check $y_2$, we use (43)
to obtain

$$(\Delta y)_1^2 = 0.1(0.6546 + 0.0798 - 0.0022) = 0.0732$$

and

$$y_2 = y_1 + (\Delta y)_1^2 = 0.0905 + 0.0732 = 0.1637$$

We now make a new table (Table 9) to include our corrected values of
y, y', Δy', etc. To find $y_3$, we use (42) to obtain

$$(\Delta y)_2^3 = 0.1(0.6550 - 0.0796 + 0.0110) = 0.0586$$
$$y_3 = 0.1637 + 0.0586 = 0.2223$$

A check on $y_3$ may be found from (43)

$$(\Delta y)_2^3 = 0.1(0.5185 + 0.0682 - 0.0011 + 0.0005) = 0.0586$$


TABLE 8

x      y        y'       Δy'       Δ²y'
0      0        1.0000
0.1    0.0907   0.8141   −0.1859
0.2    0.1641   0.6546   −0.1595   +0.0264


TABLE 9

x      y        Δy       y'       Δy'       Δ²y'      Δ³y'
0.00   0                 1.0000
0.10   0.0905   0.0905   0.8143   −0.1857
0.20   0.1637   0.0732   0.6550   −0.1593   +0.0264
0.30   0.2223   0.0586   0.5185   −0.1365   +0.0128   −0.0136
Since this is the same result as that found previously, we proceed to the
next value of x. Moreover, since the preceding y was correct at the first
trial, we suspect that the value of h might be increased, say to 0.20. We
thus obtain y for x = 0.40 in the same manner as before, then rewrite the
table for x = 0, 0.20 and 0.40. From the new table, we go on to x = 0.60,
etc.

13.20. Milne’s Method.—One further method of continuing the solution
of a differential equation is often useful. Supposing the first four
values of y and y' have been found by some of the previous methods, we
continue as follows.

1. Find a first approximation to the next y by using the formula

$${}^1y_k = y_{k-4} + \frac{4h}{3}\left(2y'_{k-1} - y'_{k-2} + 2y'_{k-3}\right)$$ (13-47)

2. Substitute this in the original differential equation (28) to find $y'_k$.
3. Use the value of $y'_k$ to calculate ${}^2y_k$ from the relation

$${}^2y_k = y_{k-2} + \frac{h}{3}\left(y'_k + 4y'_{k-1} + y'_{k-2}\right)$$ (13-48)

If ${}^1y_k$ and ${}^2y_k$ agree to the desired number of figures, we may proceed to
the next interval in the same way. If they do not agree, the size of the
interval must be decreased. The error due to the use of (48) is

$$E = \tfrac{1}{29}\left|{}^2y_k - {}^1y_k\right|$$

Eqs. (47) and (48) are obtained by integrating Newton’s interpolation
formula (3), after expressing it in terms of y'. Both formulas are exact
when fourth differences of y' vanish.

Example 15. Use Milne’s method to continue the solution of the
differential equation of the previous examples. For x = 0.4, we find using
Table 9 and (47),

$${}^1y_4 = \frac{0.40}{3}\,(2 \times 0.5185 - 0.6550 + 2 \times 0.8143) = 0.2681$$

From the original differential equation (30)

$$y'_4 = (0.6703 - 0.2681) = 0.4022$$

From (48),

$${}^2y_4 = 0.1637 + \frac{0.10}{3}\,(0.4022 + 4 \times 0.5185 + 0.6550) = 0.2681$$
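
The three steps of Milne’s method may be sketched as a single routine. The function name `milne_step` is an assumption, and the four starting values are taken here from the exact solution $y = xe^{-x}$ rather than from Table 9.

```python
import math


def milne_step(f, xs, ys, h):
    """One interval of Milne's method from the last four tabulated points.

    Returns the first approximation from (47) and the second from (48).
    """
    d = [f(x, y) for x, y in zip(xs[-4:], ys[-4:])]                   # y' at k-4 .. k-1
    x_new = xs[-1] + h
    y_first = ys[-4] + 4 * h / 3 * (2 * d[-1] - d[-2] + 2 * d[-3])    # (47)
    d_new = f(x_new, y_first)                                         # step 2
    y_second = ys[-2] + h / 3 * (d_new + 4 * d[-1] + d[-2])           # (48)
    return y_first, y_second


f = lambda x, y: math.exp(-x) - y
h = 0.1
xs = [0.0, 0.1, 0.2, 0.3]
ys = [x * math.exp(-x) for x in xs]        # assumed starting values
y_first, y_second = milne_step(f, xs, ys, h)
error_estimate = abs(y_second - y_first) / 29
print(f"first {y_first:.4f}   second {y_second:.4f}   E = {error_estimate:.1e}")
```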
Problem. Use the various methods of this chapter to obtain the solutions, correct
to four decimal places, of the differential equation dy/dx = (x − y) between 0
and 0.25, with $x_0 = 0$, $y_0 = 1$. The exact solution is $y = (x - 1) + 2e^{-x}$.


13.21. Simultaneous Differential Equations of the First Order.—Suppose
the given equations are

$$\frac{dy}{dx} = f_1(x,y,z)$$
$$\frac{dz}{dx} = f_2(x,y,z)$$ (13-49)

where x is the independent variable and y, z are dependent variables.
Provided initial values of x, y and z are given, the first increments in y and z
due to an increment Δx in x may be found by any of the methods given in
the preceding sections. The procedure should be obvious, but it is particularly
necessary to check the results carefully at each stage of the solution.
If the Runge-Kutta method is used, the following equations replace (39)
and (40)

$$k_1 = f_1(x_0, y_0, z_0)\,h$$
$$k_2 = f_1\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_1}{2},\; z_0 + \frac{m_1}{2}\right)h$$
$$k_3 = f_1\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_2}{2},\; z_0 + \frac{m_2}{2}\right)h$$
$$k_4 = f_1(x_0 + h,\; y_0 + k_3,\; z_0 + m_3)\,h$$
$$m_1 = f_2(x_0, y_0, z_0)\,h$$
$$m_2 = f_2\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_1}{2},\; z_0 + \frac{m_1}{2}\right)h$$ (13-50)
$$m_3 = f_2\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_2}{2},\; z_0 + \frac{m_2}{2}\right)h$$
$$m_4 = f_2(x_0 + h,\; y_0 + k_3,\; z_0 + m_3)\,h$$

$$x_1 = x_0 + h; \qquad y_1 = y_0 + \Delta y; \qquad z_1 = z_0 + \Delta z$$
$$\Delta y = \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$$
$$\Delta z = \tfrac{1}{6}(m_1 + 2m_2 + 2m_3 + m_4)$$ (13-51)
13.22. Differential Equations of Second or Higher Order.—Any differential
equation of second or higher order is reduced to a system of simultaneous
equations by the introduction of new variables. Consider the
equation

$$\frac{d^n y}{dx^n} = f(x, y, y', \cdots, y^{(n-1)})$$ (13-52)

where $y' = dy/dx$, $y'' = d^2y/dx^2$, etc. Make the substitutions

$$\frac{dy}{dx} = z_1; \qquad \frac{dz_1}{dx} = z_2; \qquad \cdots; \qquad \frac{dz_{n-2}}{dx} = z_{n-1}$$ (13-53)

then,

$$\frac{d^n y}{dx^n} = \frac{dz_{n-1}}{dx} = f(x, y, z_1, z_2, \cdots, z_{n-1})$$ (13-54)

Provided initial values of x, y, $z_1$, $z_2$, $\cdots$, $z_{n-1}$ are given, the problem is
equivalent to the solution of a system of simultaneous first order differential
equations which may be effected as described in sec. 13.21.

In physical problems, differential equations of the type

$$\frac{d^2y}{dx^2} + f(x,y) = 0$$

$$\frac{d^2y}{dx^2} + f\!\left(x, y, \frac{dy}{dx}\right) = 0$$

often arise, with the requirement that the variables satisfy certain boundary
conditions, say $x = x_0$, $y = y_0$ and $x = x_n$, $y = y_n$, with the initial
value of dy/dx unknown. For example, in the Thomas-Fermi theory¹⁶ of
the atom, the equation $d^2y/dx^2 = (y^3/x)^{1/2}$ occurs with the boundary conditions,
x = 0, y = 1; x = ∞, y = 0. In cases of this kind, a tentative
value of dy/dx is assumed and a rough integration is made over the range
of x. This first approximation will usually suggest a better guess for
the initial value of dy/dx. After several attempts are made, the value of
dy/dx may usually be found to the desired accuracy.

Example 16. Find y and dy/dx for the equation

$$\frac{d^2y}{dx^2} + 4x\frac{dy}{dx} - 4y = 0$$

Let dy/dx = z, then the second order equation is equivalent to the first
order equations

$$\frac{dy}{dx} = z; \qquad \frac{dz}{dx} + 4xz - 4y = 0$$

which may be solved by the previous methods. If the Runge-Kutta
method is used, $f_1(x,y,z) = z$ and $f_2(x,y,z) = -4xz + 4y$. In this case,
$f_1$ does not depend on x and y, a situation which makes the evaluation of
the k’s in (50) somewhat simpler than in the general case. The differential
equation of this problem may be solved exactly by the substitution

$$y = ve^{-x^2}$$
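
Equations (50) and (51) can be sketched for the pair of first order equations of Example 16. The function name `rk4_system_step` is an assumption, and so are the initial values y = 0, dy/dx = 1 at x = 0, which are chosen only because the equation then has the simple solution y = x against which the routine can be tested.

```python
def rk4_system_step(f1, f2, x0, y0, z0, h):
    """One interval of the Runge-Kutta method for dy/dx = f1, dz/dx = f2."""
    k1 = f1(x0, y0, z0) * h
    m1 = f2(x0, y0, z0) * h
    k2 = f1(x0 + h / 2, y0 + k1 / 2, z0 + m1 / 2) * h
    m2 = f2(x0 + h / 2, y0 + k1 / 2, z0 + m1 / 2) * h
    k3 = f1(x0 + h / 2, y0 + k2 / 2, z0 + m2 / 2) * h
    m3 = f2(x0 + h / 2, y0 + k2 / 2, z0 + m2 / 2) * h
    k4 = f1(x0 + h, y0 + k3, z0 + m3) * h
    m4 = f2(x0 + h, y0 + k3, z0 + m3) * h
    dy = (k1 + 2 * k2 + 2 * k3 + k4) / 6      # (51)
    dz = (m1 + 2 * m2 + 2 * m3 + m4) / 6
    return x0 + h, y0 + dy, z0 + dz


# Example 16 written as two first order equations.
f1 = lambda x, y, z: z
f2 = lambda x, y, z: -4 * x * z + 4 * y

x, y, z = 0.0, 0.0, 1.0        # assumed initial values
for _ in range(10):
    x, y, z = rk4_system_step(f1, f2, x, y, z, 0.1)
print(f"x = {x:.1f}   y = {y:.6f}   dy/dx = {z:.6f}")
```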


PART 2. ALGEBRAIC CALCULATIONS 


13.23. Numerical Solution of Transcendental Equations.—No general
method exists for finding the roots of transcendental equations such as
$xe^x = 1$ or $x^2 = \sin x$. Approximate values may always be found by
graphical means; where more precise results are required several analytical
procedures are available.

a. The Method of “Regula Falsi.” Suppose the given equation is
f(x) = 0, then it is obvious that the plot of y = f(x) will give the required
root when y = 0, that is when the graph crosses the x-axis. Two values
of x, say $x_0$ and $x_1$, with the corresponding values of y are selected from
a graph or otherwise, so that $y_0$ and $y_1$ have opposite signs. Then if $x_0$ is
near the root desired, a better approximation for the root is given by

$$x = x_0 + \Delta x$$

where

$$\Delta x = \frac{(x_1 - x_0)\,|y_0|}{|y_0| + |y_1|}$$ (13-55)

The process is continued until the required number of figures is obtained.

16 The differential equation and its solution are discussed in more detail by Gombás,
P., “Theorie und Lösungsmethoden des Mehrteilchenproblems der Wellenmechanik,”
Birkhäuser, Basel, 1950.

Example 17. Find the solution¹⁷ of $f(x) = (5 - x)e^x - 5 = 0$ near
x = 5. One solution is clearly x = 0; to find the other let $x_0$ = 4.5,
$x_1$ = 5.0; $y_0$ = 40.00, $y_1$ = −5.00, hence

$${}^1\Delta x = \frac{0.5 \times 40.00}{45.00} = 0.44; \qquad {}^1x = 4.50 + 0.44 = 4.94$$

A second approximation with $x_0$ = 4.94; $y_0$ = 3.382 gives

$${}^2\Delta x = \frac{0.06 \times 3.382}{8.382} = 0.024; \qquad {}^2x = 4.94 + 0.024 = 4.964$$

The third approximation with $x_0$ = 4.964; $y_0$ = 0.1516 gives

$${}^3\Delta x = \frac{0.036 \times 0.1516}{5.1516} = 0.001; \qquad {}^3x = 4.964 + 0.001 = 4.965$$

Further repetition of the calculations shows that this result is correct to four
significant figures. The value 4.965114 has been obtained by Birge.¹⁸
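
The repetition described in Example 17 may be sketched as a loop around (55). The function name `regula_falsi`, the stopping tolerance and the rule for deciding which of the two points to replace are assumptions of this sketch.

```python
import math


def regula_falsi(f, x0, x1, tol=1e-9, max_iter=200):
    """Root of f between x0 and x1 by repeated use of equation (55)."""
    y0, y1 = f(x0), f(x1)
    if y0 * y1 > 0:
        raise ValueError("f(x0) and f(x1) must have opposite signs")
    x_prev = x0
    for _ in range(max_iter):
        dx = (x1 - x0) * abs(y0) / (abs(y0) + abs(y1))   # equation (55)
        x = x0 + dx
        y = f(x)
        if y == 0 or abs(x - x_prev) < tol:
            return x
        # Keep the pair of points that still brackets the root.
        if y * y0 > 0:
            x0, y0 = x, y
        else:
            x1, y1 = x, y
        x_prev = x
    return x


f = lambda x: (5 - x) * math.exp(x) - 5
root = regula_falsi(f, 4.5, 5.0)
print(f"root = {root:.6f}")
```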


b. The Newton-Raphson Method. When the derivative of f(x) is easily
evaluated numerically, the real roots of f(x) = 0 may be determined in the
following way. Suppose $x_0$ is an approximate value of one of the roots,
then an improved value of the root is given by

$$x = x_0 + \Delta x; \qquad \Delta x = -\frac{f(x_0)}{f'(x_0)}$$ (13-56)

The next approximation is found by substituting x in place of $x_0$ to get a
new value of Δx, continuing in this way as long as necessary. In practice,
it will be found that after a few approximations, the value of the derivative
will change very little with succeeding values of x, hence f' need not be
recomputed.

17 This equation occurs in the theory of black-body radiation, see, for example,
Taylor, H. S., and Glasstone, S., “A Treatise on Physical Chemistry,” Vol. 1, D. Van
Nostrand Co., Inc., New York, 1942.

18 Birge, R. T., Revs. Mod. Phys. 13, 233 (1941).

Example 18. Find x of Example 17, starting with $x_0$ = 4.9. Substitution
gives $f(x_0)$ = 8.43; $f'(x_0)$ = −120.87; ${}^1\Delta x$ = 8.43/120.87 =
0.07; ${}^1x$ = 4.97. The second approximation is obtained from f(4.97) =
−0.677; f'(4.97) = −139.78; ${}^2\Delta x$ = −0.677/139.78 = −0.005; ${}^2x$ =
4.965.
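
Equation (56) applied repeatedly gives the following minimal sketch. The function name `newton_raphson` and the tolerance are assumptions; the derivative $f'(x) = (4 - x)e^x$ follows by differentiating the function of Example 17.

```python
import math


def newton_raphson(f, fprime, x0, tol=1e-10, max_iter=50):
    """Improve an approximate root x0 by repeated use of equation (56)."""
    x = x0
    for _ in range(max_iter):
        dx = -f(x) / fprime(x)
        x += dx
        if abs(dx) < tol:
            break
    return x


f = lambda x: (5 - x) * math.exp(x) - 5
fprime = lambda x: (4 - x) * math.exp(x)
root = newton_raphson(f, fprime, 4.9)
print(f"root = {root:.6f}")
```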


c. The Method of Iteration. If we rewrite our equation f(x) = 0 in the
form

$$x = \phi(x)$$ (13-57)

we may substitute an approximate value of x, say $x_0$, on the right of (57)
to get ${}^1x = \phi(x_0)$ and repeat to get

$${}^2x = \phi({}^1x); \qquad {}^3x = \phi({}^2x); \qquad \text{etc.}$$ (13-58)

It is often possible to write f(x) = 0 in the form x = φ(x) in several
different ways, in which case, it is better to start with the simplest such
arrangement. A few approximations will indicate whether the chosen
form is suitable but if the succeeding values of x do not converge rapidly,
one of the alternative functions should be tried. The condition for convergence
is found to be that φ'(x), the derivative of φ(x), be less than unity in
absolute value in the neighborhood of the desired root. As this derivative
becomes smaller, the convergence becomes more rapid.

Example 19. Find x of the function in Examples 17 and 18 by the
method of iteration. Writing the equation in the form $x = 5e^{-x}(e^x - 1)$
we find with $x_0$ = 4.9; $e^{x_0}$ = 134.3; ${}^1x$ = (5 × 133.3)/134.3 = 4.963.
The next approximation gives $e^{{}^1x}$ = 143.1; ${}^2x$ = (5 × 142.1)/143.1 = 4.965.

Problem. Solve the equation x log x = 1.5334 by the methods of this section.

Ans.: x = 3.1110.
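
The method of iteration of Example 19 may be sketched as below. The function name `iterate` and the stopping tolerance are assumptions; the form $x = 5(1 - e^{-x})$ is the arrangement used in the example.

```python
import math


def iterate(phi, x0, tol=1e-10, max_iter=200):
    """Repeat x = phi(x), equations (57) and (58), until successive values agree."""
    x = x0
    for _ in range(max_iter):
        x_new = phi(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


phi = lambda x: 5 * (1 - math.exp(-x))     # same as 5 exp(-x) (exp(x) - 1)
first = phi(4.9)
root = iterate(phi, 4.9)
print(f"first approximation {first:.3f}   root {root:.6f}")
```

Near the root the derivative of this φ(x) is about 0.035, which accounts for the rapid convergence.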


13.24. Simultaneous Equations in Several Unknowns.—The real roots
of simultaneous algebraic or transcendental equations may be found by the
methods of secs. 13.23b or 13.23c. In the Newton-Raphson method,
when two equations are given

$$f(x,y) = 0; \qquad g(x,y) = 0$$ (13-59)

(56) is replaced by

$$x = x_0 + \Delta x; \qquad y = y_0 + \Delta y$$

where

$$\Delta x = \frac{1}{\Delta}\begin{vmatrix} -f(x_0,y_0) & f_y(x_0,y_0) \\ -g(x_0,y_0) & g_y(x_0,y_0) \end{vmatrix}$$

$$\Delta y = \frac{1}{\Delta}\begin{vmatrix} f_x(x_0,y_0) & -f(x_0,y_0) \\ g_x(x_0,y_0) & -g(x_0,y_0) \end{vmatrix}$$

$$\Delta = \begin{vmatrix} f_x(x_0,y_0) & f_y(x_0,y_0) \\ g_x(x_0,y_0) & g_y(x_0,y_0) \end{vmatrix}$$

In the method of iteration, we rewrite (59) as

$$x = \phi(x,y); \qquad y = \psi(x,y)$$

then

$${}^1x = \phi(x_0,y_0); \qquad {}^1y = \psi({}^1x,y_0)$$
$${}^2x = \phi({}^1x,{}^1y); \qquad {}^2y = \psi({}^2x,{}^1y); \qquad \text{etc.}$$ (13-60)

Both methods are readily extended to cases of more than two unknowns.
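
The Newton-Raphson corrections for two equations can be sketched with the determinants written out. The function name `newton_two_equations` is an assumption, and the pair of equations $x^2 + y^2 = 4$, $xy = 1$ is a hypothetical test case chosen for this sketch, not one taken from the text.

```python
def newton_two_equations(f, g, fx, fy, gx, gy, x0, y0, tol=1e-12, max_iter=50):
    """Newton-Raphson method for f(x, y) = 0, g(x, y) = 0 using 2 x 2 determinants."""
    x, y = x0, y0
    for _ in range(max_iter):
        det = fx(x, y) * gy(x, y) - fy(x, y) * gx(x, y)           # the determinant Δ
        dx = (-f(x, y) * gy(x, y) + g(x, y) * fy(x, y)) / det     # correction Δx
        dy = (-g(x, y) * fx(x, y) + f(x, y) * gx(x, y)) / det     # correction Δy
        x, y = x + dx, y + dy
        if abs(dx) + abs(dy) < tol:
            break
    return x, y


# Hypothetical test case: x^2 + y^2 = 4 and x y = 1, starting near (2, 0.5).
x, y = newton_two_equations(
    f=lambda x, y: x * x + y * y - 4,
    g=lambda x, y: x * y - 1,
    fx=lambda x, y: 2 * x,
    fy=lambda x, y: 2 * y,
    gx=lambda x, y: y,
    gy=lambda x, y: x,
    x0=2.0,
    y0=0.5,
)
print(f"x = {x:.6f}   y = {y:.6f}")
```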

13.25. Numerical Determination of the Roots of Polynomials.—Any
of the methods of sec. 13.23 may be applied to determine the real roots of a
polynomial. When all of the roots are not required, the Newton-Raphson
method is probably more rapid than the others.¹⁹ In order to evaluate
f(x) and f'(x) for $x = x_0$, the following procedure will be found useful.
Suppose the polynomial is $y(x) = c_0x^n + c_1x^{n-1} + \cdots + c_n$. Write the
coefficients in a line, supplying zeros if any powers of x are missing. Multiply
the number $c_0$ by $x_0$ and add the result to $c_1$; multiply this sum ($d_1$)
by $x_0$ and add to $c_2$, continuing until the last sum is obtained; its value
equals y(x) for $x = x_0$. The scheme is illustrated in Table 10. In actual
computation with a calculating machine nothing need be written down
since with proper care to locate the decimal point and due regard to sign,
the whole process may be performed as a continuous operation. The use
of this method is illustrated further in the last part of Example 20.


TABLE 10

c_0    c_1        c_2        c_3        ...    c_n
       c_0 x_0    d_1 x_0    d_2 x_0    ...    d_{n-1} x_0
       d_1        d_2        d_3        ...    d_n
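
The scheme of Table 10 is a single running sum. In the sketch below the function name `horner_with_derivative` is an assumption, and the simultaneous accumulation of the derivative is an addition made here for convenience; the text obtains f'(x) by applying the same scheme to the differentiated polynomial. The cubic used is only a test case.

```python
def horner_with_derivative(coeffs, x0):
    """Evaluate c0 x^n + c1 x^(n-1) + ... + cn and its derivative at x0.

    `value` follows the sums d1, d2, ... of Table 10; `deriv` accumulates the
    derivative in the same pass.
    """
    value = coeffs[0]
    deriv = 0
    for c in coeffs[1:]:
        deriv = deriv * x0 + value
        value = value * x0 + c
    return value, deriv


# Test case: x^3 - 15 x^2 + 74 x - 120 at x = 2.
value, slope = horner_with_derivative([1, -15, 74, -120], 2.0)
print(value, slope)
```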


Graeffe’s root-squaring method will be found to involve little more labor
than the preceding method with the added advantage that it gives all of
the roots of the polynomial at once. No initial approximation is required
and complex as well as real roots may be found. It is convenient to divide
by the coefficient of $x^n$ if necessary so that the polynomial appears in the
form $y(x) = x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_n = 0$. Using detached
coefficients, Table 11 is calculated. Care must be taken with the signs of the
doubled cross-products. The new coefficients $b_1$, $b_2$, $\cdots$, are then squared
and the cross-products of the b’s determined in a similar way. As the
squaring process is continued, it will be found that the doubled cross-products
become progressively smaller, eventually contributing nothing to
the next squared terms. When this point is reached, there will be n coefficients,
say $m_1$, $m_2$, $\cdots$, $m_n$. Then if $x_1$, $x_2$, $\cdots$, $x_n$ are the n real roots of the
polynomial

$$|x_1|^p = m_1; \qquad |x_2|^p = \frac{m_2}{m_1}; \qquad \cdots; \qquad |x_n|^p = \frac{m_n}{m_{n-1}}$$

or,

$$\log|x_1| = \frac{1}{p}\log m_1$$
$$\log|x_2| = \frac{1}{p}(\log m_2 - \log m_1)$$
$$\log|x_3| = \frac{1}{p}(\log m_3 - \log m_2)$$
$$\cdots$$
$$\log|x_n| = \frac{1}{p}(\log m_n - \log m_{n-1})$$ (13-61)

where $p = 2^s$ and s is the number of times the squaring operation has been
performed. The signs of the roots must be determined by some rule of
signs but this may often be done by inspection.

19 Horner’s method does not appear to have any advantages over the Newton-Raphson
method. It is described by Mellor, J. W., “Higher Mathematics for Students
of Chemistry and Physics,” Longmans, Green and Co., New York, 1902, and in most
elementary algebra texts.


TABLE 11

1    a_1       a_2          a_3          a_4
1    a_1^2     a_2^2        a_3^2        a_4^2
     −2a_2     −2a_1a_3     −2a_2a_4     −2a_3a_5
               +2a_4        +2a_1a_5     +2a_2a_6
                            −2a_6        −2a_1a_7
                                         +2a_8
1    b_1       b_2          b_3          b_4


In practice, it is best to carry only four or five figures in the calculations,
hence tables of squares and four-place logarithms may be used if a calculating
machine is unavailable. If more figures are required in the roots,
the use of the Newton-Raphson method serves both to give these additional
figures and to check the previous calculations.

When two (or more) roots of the polynomial are real and equal, one of 
the doubled cross-products will not decrease in magnitude as the squaring 
proceeds; in fact it will always be equal to one-half of the squared term 
which stands just above it. The squaring in this case is stopped when 
the other cross-products no longer contribute to the next coefficients. 

The presence of complex roots in a polynomial expression is revealed
by the fact that the doubled cross-products do not disappear and the signs
of some of the sums alternate as the squaring proceeds. The method of
finding the complex roots as well as pairs of real roots is described in detail
both by Scarborough and by Whittaker and Robinson (loc. cit.).

TABLE 12

1    −5.600 × 10       4.900 × 10^2      1.111 × 10^4      −1.175 × 10^5
      3.136 × 10^3     2.401 × 10^5      1.234 × 10^8       1.381 × 10^10
     −0.980           12.44              1.152
                      −2.350
1     2.156 × 10^3     1.250 × 10^6      2.386 × 10^8       1.381 × 10^10
      4.648 × 10^6     1.562 × 10^12     5.693 × 10^16
     −2.500           −1.029            −3.452
                       0.028
1     2.148 × 10^6     5.610 × 10^11     2.241 × 10^16      1.907 × 10^20
      4.614 × 10^12    3.147 × 10^23     5.022 × 10^32
     −1.122           −0.963            −2.140
                       0.004
1     3.492 × 10^12    2.188 × 10^23     2.882 × 10^32      3.637 × 10^40
      1.219 × 10^25    4.787 × 10^46     8.306 × 10^64
     −0.044           −0.201            −1.591
1     1.175 × 10^25    4.586 × 10^46     6.715 × 10^64      1.323 × 10^81
      1.381 × 10^50    2.103 × 10^93     4.388 × 10^129     1.750 × 10^162
1     1.904 × 10^100   4.414 × 10^186    1.925 × 10^259     3.062 × 10^324


Example 20. Find the four real roots of the polynomial²⁰

$$y(x) = x^4 - 56x^3 + 490x^2 + 11{,}112x - 117{,}495 = 0$$

The method is apparent from Table 12. It will be seen that the second row
of doubled cross-products may be neglected after the eighth power terms
and the first row after the thirty-second power terms, hence the squaring
is stopped after raising the coefficients to the sixty-fourth power. We then
find that

log |x_1| = 100.2797/64 = 1.5669
log |x_2| = (186.6448 − 100.2797)/64 = 1.3494
log |x_3| = (72.6396)/64 = 1.1350
log |x_4| = (65.2016)/64 = 1.0188

so that |x_1| = 36.89; |x_2| = 22.36; |x_3| = 13.65; |x_4| = 10.45.
Inspection shows that all signs are positive except that of $x_3$. With these
values, the sum of the roots is 56.05, in approximate agreement with the
coefficient of $x^3$ in the original equation.

20 Solution of similar equations is needed to calculate the energy levels of the asymmetric
top in quantum mechanics; see, for example, Herzberg, G., “Infrared and Raman
Spectra of Polyatomic Molecules,” D. Van Nostrand Co., Inc., New York, 1945.

In order to improve these values, we make use of the Newton-Raphson
method. With $x_1$ = 36.89, the scheme of Table 10 gives in succession

(1 × 36.89) − 56 = −19.11
(−19.11 × 36.89) + 490 = −214.97
(−214.97 × 36.89) + 11,112 = 3,181.76
(3,181.76 × 36.89) − 117,495 = −120

so that $y(x_1)$ = −120. In the same way, from

$$y'(x) = 4x^3 - 168x^2 + 980x + 11{,}112$$

we find

(4 × 36.89) − 168 = −20.44
(−20.44 × 36.89) + 980 = 225.97
(225.97 × 36.89) + 11,112 = 19,448

so that $y'(x_1)$ = 19,448. Then,

$\Delta x_1$ = 120/19,448 = 0.0062

and

${}^1x_1$ = 36.89 + 0.0062 = 36.8962

Repeating the calculations, we obtain $y({}^1x_1)$ = 3.57; $y'({}^1x_1)$ = 19,478;
$\Delta\,{}^1x_1$ = −0.0002; ${}^2x_1$ = 36.8960. This value is correct to five significant
figures. The same procedure applied to the other roots gives 22.3410;
−13.6669; 10.4302. The sum of these values which is 56.0005 gives a
further check on the results.


Problem. Find the roots of $x^3 - 15x^2 + 74x - 120 = 0$, by the Graeffe method.
Ans.: x = 4, 5, 6.
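
The root-squaring process can be sketched for polynomials with integer coefficients and real, unequal roots. The function names `graeffe_step` and `graeffe_magnitudes` are assumptions; exact integer arithmetic is used here instead of the four or five figures recommended above, simply because the language provides it. The signs of the roots are not determined by this sketch.

```python
import math


def graeffe_step(a):
    """One squaring: b_k = a_k^2 - 2 a_(k-1) a_(k+1) + 2 a_(k-2) a_(k+2) - ..."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        total = a[k] * a[k]
        j = 1
        while k - j >= 0 and k + j <= n:
            total += (-1) ** j * 2 * a[k - j] * a[k + j]   # doubled cross-products
            j += 1
        b.append(total)
    return b


def graeffe_magnitudes(coeffs, squarings=6):
    """Absolute values of the roots from (61), for coefficients [1, a_1, ..., a_n]."""
    a = list(coeffs)
    for _ in range(squarings):
        a = graeffe_step(a)
    p = 2 ** squarings
    return [
        math.exp((math.log(a[k]) - math.log(a[k - 1])) / p)
        for k in range(1, len(a))
    ]


cubic = graeffe_magnitudes([1, -15, 74, -120])
quartic = graeffe_magnitudes([1, -56, 490, 11112, -117495])
print([round(v, 4) for v in cubic])
print([round(v, 2) for v in quartic])
```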


13.26. Numerical Solution of Simultaneous Linear Equations.—
Systems of the form

$$\sum_{k=1}^{n} a_{ik}x_k = g_i \qquad (i = 1, 2, \cdots, n)$$ (13-62)

where the $a_{ik}$ and $g_i$ are numbers and the $x_k$ are sought, often occur in
physical problems, particularly in the solution of the normal equations
resulting from a least squares treatment of numerical data (see sec. 13.37b).
Several methods of solving such equations are given by Whittaker and
Robinson (loc. cit.) but none of these are particularly suitable for machine
calculation (see also sec. 10.9). When $a_{ik} = a_{ki}$, which is usually true,
the determinantal method described there offers certain advantages but in
general when the number of unknowns is greater than four or five the
labor of evaluating the determinants becomes prohibitive. The following
systematic procedure²¹ which is well adapted for machine calculation will
be found useful in such cases.

Using detached coefficients, the numbers in (62) are written down as in
Table 13. For convenience we assume that there are only three unknowns;

TABLE 13

        g_1       g_2       g_3
        a_11      a_12      a_13       −a_22/a_21    −a_23/a_21
(A)     a_21*     a_22      a_23        1             0             (B)
        a_31      a_32      a_33        0             1

        g'_2      g'_3
        b_11      b_12                 −b_22/b_21
        b_21*     b_22                  1

        g''_3
        c_11*

extension of the method to a larger number may be made without difficulty.
Choose some unknown, say $x_2$, for elimination. Divide the numbers of the
corresponding row of (A) by the first number in that row (we indicate it
with a star) and add ones and zeros as shown to form (B). Now consider
$g_1$, $g_2$, $g_3$ as a row matrix and multiply the columns of (B) by this
row (see sec. 10.6). The results are $g'_2$ and $g'_3$. For example, $g'_2 = g_1 \times
(-a_{22}/a_{21}) + g_2 \times 1 + g_3 \times 0$ and $g'_3 = g_1 \times (-a_{23}/a_{21}) + g_2 \times 0 + g_3
\times 1$. Multiply rows of (A) by columns of (B), omitting the starred row
of (A). This gives the numbers $b_{ij}$. Again star an element and repeat
the process until the last unknown is eliminated. The values of x are then
given by

$$x_1 = g''_3/c_{11}$$
$$x_3 = (g'_2 - b_{11}x_1)/b_{21}$$
$$x_2 = (g_1 - a_{11}x_1 - a_{31}x_3)/a_{21}$$ (13-63)

Some care must be exercised in the order of elimination of the x’s, especially
if they are of widely different magnitudes. It is always advisable to begin
with the smallest one, proceeding with the elimination in order of increasing
magnitude. If this is not done, the cumulative errors in the calculations
will produce unsatisfactory values of the unknowns.


21 See Frazer, R. A., Duncan, W. J., and Collar, A. R., “Elementary Matrices,”
Cambridge University Press, 1938; Jeffreys, H. and Jeffreys, B. S., “Methods of Mathematical
Physics,” Second Edition, Cambridge University Press, 1950 and Milne, loc. cit.




Example 21. Fit the data of Example 3, sec. 13.3 to an equation of the 
form € = 2% + tet + z3. The three simultaneous equations become 


zı + 630.522 + 3.975 X 10°x3 = 5.535 


zı + 960.522 + 9.226 x 10°, = 9.117 
zı + 1063.072 + 11.300 x 10°xg = 10.301 (13-64) 
TABLE 14

     5.535            9.117            10.301        |
     1                1                1             |  −2.32101   −2.84277
     630.5            960.5            1063.0        |      1          0
     3.975 × 10^5*    9.226 × 10^5     11.300 × 10^5 |      0          1

    −3.72979         −5.43373                        |
    −1.32101         −1.84277                        |  −1.45033
    −502.897*        −729.366                        |      1

    −0.02430
    +0.07313*


Since the magnitude of the $x$'s is probably $x_3 < x_2 < x_1$, we choose the
starred numbers in that order. If we desire four significant figures in the
final results, we note that we must carry six figures in the calculations,
since two figures disappear in one of the steps. The scheme is shown in
Table 14. Then,


$$
\begin{aligned}
x_1 &= -0.02430/0.07313 = -0.3323 \\
x_2 &= -(-3.730 - 1.321 \times 0.3323)/502.9 = 0.00829 \\
x_3 &= (5.535 + 0.3323 - 630.5 \times 0.00829)/(3.975 \times 10^{5}) = 1.611 \times 10^{-6}
\end{aligned}
$$


Substitution of these results in the original equations gives as a check, 
5.535, 9.116, 10.300. 
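
The arithmetic of Example 21 can be checked with a short program. The sketch
below does not reproduce the tabular scheme of Table 14; it assumes ordinary
Gaussian elimination with row interchanges, and the function name
`solve_linear_system` is chosen here purely for illustration.

```python
def solve_linear_system(coefficients, constants):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(constants)
    # Work on an augmented copy so the caller's data is left untouched.
    rows = [list(map(float, row)) + [float(g)]
            for row, g in zip(coefficients, constants)]
    for col in range(n):
        # Bring the largest remaining entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution, last unknown first.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rows[i][n] - known) / rows[i][i]
    return x


# Coefficients and right-hand sides of eq. (13-64).
A = [[1.0, 630.5, 3.975e5],
     [1.0, 960.5, 9.226e5],
     [1.0, 1063.0, 11.300e5]]
g = [5.535, 9.117, 10.301]
x1, x2, x3 = solve_linear_system(A, g)
```

The computed unknowns agree with the values of the example to within about a
unit in the last figure quoted there.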

13.27. Evaluation of Determinants.—The procedure just outlined is
also applicable to the evaluation of determinants, the scheme being similar
to that shown in Table 14 except for the fact that the $g$'s are omitted. If
the starred elements are taken in the first column and row, that is, in the
order $a_{11}$, $b_{11}$, $c_{11}$, etc., the value of the determinant equals
the product of all of these starred elements. If some other order is chosen,
as in Example 21, the determinant still equals this product but it must be
multiplied by $(-1)^n$ where $n$ is the number of interchanges required to
bring the starred elements into the position of the element which stands
first in the corresponding array. If it is convenient to choose starred
elements that are not in the first column the necessary modification of the
procedure will be found described by Frazer, Duncan and Collar (loc. cit.).
A method for determinants, with suitable checking procedures, is also given
by Milne (loc. cit.).
Example 22. Evaluate the determinant of the coefficients of the $x$'s
in Example 21. Since two interchanges are required to bring the first
starred element to the position $a_{11}$ and one interchange to bring
$b_{21}$ to $b_{11}$, the value of the determinant $\Delta$ is given by

$$
\Delta = (-1)^{3} \times (3.975 \times 10^{5}) \times (-502.897) \times (0.07313)
       = 1.4619 \times 10^{7}
$$
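
The rule that the determinant is the product of the starred elements, with a
factor $(-1)$ for every interchange, can be illustrated in code. The following
is a sketch assuming ordinary elimination with partial pivoting rather than
the starred order used in Table 14; the name `determinant` is illustrative.

```python
def determinant(matrix):
    """Determinant by elimination: product of pivots, sign flipped per row swap."""
    rows = [list(map(float, row)) for row in matrix]
    n = len(rows)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if rows[pivot][col] == 0.0:
            return 0.0  # a zero column means the determinant vanishes
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # each interchange contributes a factor (-1)
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n):
                rows[r][c] -= factor * rows[col][c]
    return det


# Coefficients of the x's in eq. (13-64).
coefficients = [[1.0, 630.5, 3.975e5],
                [1.0, 960.5, 9.226e5],
                [1.0, 1063.0, 11.300e5]]
delta = determinant(coefficients)
```

Table 14 lists these coefficients with rows and columns exchanged, which does
not alter the value of the determinant.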


Problem. Evaluate some of the determinants of Example 23, sec. 13.28. The 
answers are found in Table 16. 




13.28. Solution of Secular Determinants.—In many quantum mechanical
problems, it is necessary to find one or more roots of a secular equation
(see sec. 10.14):

$$
y(\lambda) = |\,a_{ij} - b_{ij}\lambda\,| = 0 \tag{13-65}
$$

$a_{ij} = a_{ji}$, $b_{ij} = b_{ji}$; $i, j = 1, 2, \ldots, N$. In most cases,
$b_{ij} = \delta_{ij}$, but even if this is not true in the original form of
the determinant it is usually possible to reduce (65) to this form by suitable
addition and subtraction of rows and columns. We shall assume here that
$\lambda$ occurs only in the diagonal elements. The particular method to be
used in finding values of $\lambda$ depends to some extent on the special
problem at hand. We present three methods, each of which has certain
advantages.


a. The Polynomial Method. When (65) is expanded, it obviously
gives a polynomial of the $N$-th degree in $\lambda$. Once this polynomial is
obtained, either of the methods of sec. 13.25 may be used to find values of
$\lambda$. Graeffe’s method is particularly useful when it is required to find
all of them. To convert the determinant into the polynomial, its expansion
may be effected by the usual method of reduction of its order (see sec.
10.3) or by a very convenient procedure which has been described by
Hicks.²²

According to the latter method, we substitute $\lambda = 0, 1, 2, \ldots, (N + 1)$
in the given determinant and evaluate each numerically. From these
$(N + 2)$ results, $y_0, y_1, y_2, \ldots, y_{N+1}$, a table of differences is
formed as described in sec. 13.2. An immediate check on the computation of the
determinants is available, for the $(N + 1)$-st differences should vanish.
The polynomial is then given by

$$
y(\lambda) = \sum_{i=0}^{N} p_i\lambda^{i} \tag{13-66}
$$

where

$$
p_0 = y_0; \qquad p_i = \sum_{s=i}^{N} r_{is}\,\Delta^{s}y_0, \quad i \ge 1 \tag{13-67}
$$


22 Hicks, B. L., J. Chem. Phys. 8, 569 (1940). 
The coefficients $r_{is}$ are independent of the values of the elements in (65)
and may be computed from the following relations:

$$
r_{is} = \frac{c_i(s)}{s!} \tag{13-68}
$$

$$
c_s(s) = 1; \qquad c_i(s+1) = c_{i-1}(s) - s\,c_i(s); \qquad c_0(s) = 0
$$
where 
_ 
a(et1) = (Dt, 621; ea) = 59 
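The results may be checked by the identities

$$
\sum_{i=1}^{s} \frac{|c_i(s)|}{s!} = 1, \qquad \sum_{i=1}^{s} \frac{c_i(s)}{s!} = 0 \tag{13-69}
$$

the second of which holds for $s \ge 2$.
Values of the $r_{is}$ through $i = s = 6$ are given in Table 15.

TABLE 15

  s \ i      1         2         3         4        5        6
    1        1
    2      −1/2       1/2
    3       1/3      −1/2       1/6
    4      −1/4      11/24     −1/4      1/24
    5       1/5      −5/12      7/24    −1/12    1/120
    6      −1/6     137/360    −5/16    17/144   −1/48    1/720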

Example 23. As an example of the use of this method, we choose the
secular determinant whose expanded form served as an example for the
Graeffe method (see Example 20). The determinant follows

$$
y(\lambda) =
\begin{vmatrix}
36 - \lambda & 4.062 & 0 & 0 \\
4.062 & 16 - \lambda & 8.216 & 0 \\
0 & 8.216 & 4 - \lambda & 14.49 \\
0 & 0 & 14.49 & -\lambda
\end{vmatrix}
= 0
$$


Making the substitutions $\lambda = 0, 1, 2, 3, 4, 5$ in turn and evaluating the
determinants, we obtain Table 16.

TABLE 16

   λ         y           Δ          Δ²       Δ³      Δ⁴
   0     −117,495
   1     −105,948     +11,547
   2      −93,743     +12,205     +658
   3      −81,180     +12,563     +358     −300
   4      −68,535     +12,645      +82     −276     +24
   5      −56,060     +12,475     −170     −252     +24

The fact that the fifth differences vanish assures us that the determinants
have been computed correctly. From (67), Tables 15 and 16, we find


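$$
\begin{aligned}
p_0 &= -117{,}495 \\
p_1 &= 11{,}547 - \frac{658}{2} - \frac{300}{3} - \frac{24}{4} = 11{,}112 \\
p_2 &= \frac{658}{2} + \frac{300}{2} + \frac{11 \times 24}{24} = 490 \\
p_3 &= -\frac{300}{6} - \frac{24}{4} = -56 \\
p_4 &= 1
\end{aligned}
$$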

hence the required polynomial is

$$
y(\lambda) = \lambda^{4} - 56\lambda^{3} + 490\lambda^{2} + 11{,}112\lambda - 117{,}495
$$

in agreement with the result given in Example 20.
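
The conversion from a difference table to polynomial coefficients is easy to
mechanize. The sketch below assumes exact rational arithmetic from Python's
standard library and rebuilds the coefficients $c_i(s)$ from the recursion
accompanying eq. (13-68); the function names are illustrative and are not taken
from Hicks' paper.

```python
from fractions import Fraction
from math import factorial


def leading_differences(values):
    """Return [y_0, Δy_0, Δ²y_0, ...], the top diagonal of the difference table."""
    leading, row = [], list(values)
    while row:
        leading.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return leading


def stirling_coefficients(order):
    """c[s][i] from c_i(s+1) = c_{i-1}(s) - s*c_i(s), with c_s(s) = 1, c_0(s) = 0."""
    c = [[0] * (order + 1) for _ in range(order + 1)]
    c[0][0] = 1  # seed value; it reproduces c_1(1) = 1
    for s in range(order):
        for i in range(1, s + 2):
            c[s + 1][i] = c[s][i - 1] - s * c[s][i]
    return c


def polynomial_coefficients(values, degree):
    """Coefficients p_0..p_N of eq. (13-66) from y(0), y(1), y(2), ..."""
    diffs = leading_differences(values)
    c = stirling_coefficients(degree)
    return [sum(Fraction(c[s][i], factorial(s)) * diffs[s]
                for s in range(i, degree + 1))
            for i in range(degree + 1)]


y = [-117495, -105948, -93743, -81180, -68535, -56060]  # Table 16
p = polynomial_coefficients(y, 4)
```

Here `p` holds $p_0, \ldots, p_4$ in ascending order, and the sixth entry of
`leading_differences(y)` is the fifth difference used as a check.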


b. Matrix Method. A matrix method, described by Frazer, Duncan
and Collar (loc. cit.) is sometimes useful. It gives the largest value of
$|\lambda|$ only, but in quantum mechanical problems this is often all that is
required. The method does not converge rapidly unless the largest root
is widely separated from the remaining ones. The procedure is as follows.
Set $\lambda = 0$ in the secular determinant and multiply the resulting matrix
by a matrix of one column. The latter is arbitrary but in its most convenient
form it contains unity in one row and zeros in the other rows. Extract a
constant scalar quantity from the resulting matrix product and multiply
the original matrix with the new one-column matrix. Continue in the same
way until the scalar quantity becomes constant. This is the required root
of largest amplitude.


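Example 24. Find the largest root of the secular determinant of
Example 23. The procedure is apparent from the following.

$$
\begin{pmatrix}
36 & 4.062 & 0 & 0 \\
4.062 & 16 & 8.216 & 0 \\
0 & 8.216 & 4 & 14.49 \\
0 & 0 & 14.49 & 0
\end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 36 \\ 4.062 \\ 0 \\ 0 \end{pmatrix}
= 36
\begin{pmatrix} 1 \\ 0.1128 \\ 0 \\ 0 \end{pmatrix}
$$

For the next approximation,

$$
\begin{pmatrix}
36 & 4.062 & 0 & 0 \\
4.062 & 16 & 8.216 & 0 \\
0 & 8.216 & 4 & 14.49 \\
0 & 0 & 14.49 & 0
\end{pmatrix}
\begin{pmatrix} 1 \\ 0.1128 \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 36.46 \\ 5.87 \\ 0.938 \\ 0 \end{pmatrix}
= 36.46
\begin{pmatrix} 1 \\ 0.1610 \\ 0.0256 \\ 0 \end{pmatrix}
$$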
Continuing in the same way, we obtain the results in Table 17. The sixth,
seventh, eighth and ninth approximations give 36.85, 36.87, 36.88, 36.88,
hence $\lambda = 36.88$. Comparison with Example 20 shows that this result is
uncertain in the last place. The convergence here is not very rapid since
the next largest root is 22.341. More rapid convergence could be obtained
by squaring the original matrix several times before commencing the matrix
multiplications. The constant value so obtained is then some power of the
desired root. Once having found the largest root, the next largest one may
be obtained by the same method. Further details are given by Frazer,
Duncan and Collar (loc. cit.).
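
The repeated multiplication is tedious by hand but short in code. The sketch
below assumes the normalization used in Example 24, in which the first element
of each column is reduced to unity; the function name `power_iteration` and
the choice of step counts are illustrative.

```python
def power_iteration(matrix, steps):
    """Repeatedly multiply by a column, extracting its first element as the scalar."""
    n = len(matrix)
    column = [1.0] + [0.0] * (n - 1)  # unity in one row, zeros elsewhere
    scalar = 0.0
    for _ in range(steps):
        product = [sum(a * x for a, x in zip(row, column)) for row in matrix]
        scalar = product[0]                       # the extracted scalar factor
        column = [p / scalar for p in product]    # new one-column matrix
    return scalar, column


# Secular determinant of Example 23 with lambda set to zero.
A = [[36.0, 4.062, 0.0, 0.0],
     [4.062, 16.0, 8.216, 0.0],
     [0.0, 8.216, 4.0, 14.49],
     [0.0, 0.0, 14.49, 0.0]]

first, _ = power_iteration(A, 1)
second, _ = power_iteration(A, 2)
largest, column = power_iteration(A, 200)
```

With a few hundred steps the scalar settles close to 36.90, slightly above the
36.88 reached after nine hand approximations, which the text already notes is
uncertain in the last place.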


TABLE 17

Successive Column Matrices

        Third                Fourth                Fifth
   36.65    1           36.76    1           36.81    1
    6.85    0.1869       7.37    0.2005       7.68    0.2086
    1.42    0.0387       1.84    0.0500       2.07    0.0562
    0.37    0.0101       0.56    0.0152       0.72    0.0196


c. Iteration Method. Several iteration methods which do not depend
on matrix properties have been described.²³ Crude approximations to the
roots of the polynomial are given by the diagonal terms in the secular
determinant. Suppose one of these values, say $\lambda_0$, is substituted in
the determinant for $\lambda$ in every place except one, where the quantity
$\lambda_0 - \lambda$ occurs. Now if the determinant is evaluated, the resultant
value of $\lambda$ is the next approximation to the true value. The process may
be repeated as often as necessary.

Example 25. Find a root of the determinant of Example 23 by the
iteration method. Taking $\lambda_0 = 36$, the determinant becomes

$$
\begin{vmatrix}
36 - \lambda & 4.062 & 0 & 0 \\
4.062 & -20 & 8.216 & 0 \\
0 & 8.216 & -32 & 14.49 \\
0 & 0 & 14.49 & -36
\end{vmatrix}
$$

When this is evaluated, we obtain ${}^{1}\lambda = 36.893$. Substitution of
${}^{1}\lambda$ in the original determinant gives ${}^{2}\lambda = 36.896$. The
third approximation gives the same result.
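
One way to program this iteration is sketched below. It rests on an assumption
made here for illustration: with $\lambda_0$ substituted everywhere except in
the diagonal element of row `keep`, the determinant is linear in $\lambda$, so
that $\lambda = \lambda_0 + D/D_{kk}$, where $D$ is the determinant with
$\lambda_0$ in every diagonal place and $D_{kk}$ is its minor for the retained
element. The names `det` and `iterate_root` are hypothetical.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row (small orders only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def iterate_root(matrix, guess, keep=0, steps=10):
    """Refine `guess`, keeping lambda unknown only in diagonal position `keep`."""
    n = len(matrix)
    for _ in range(steps):
        # Substitute the current guess for lambda in every diagonal element.
        shifted = [[matrix[i][j] - (guess if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        # Minor obtained by deleting the row and column of the retained element.
        minor = [row[:keep] + row[keep + 1:]
                 for k, row in enumerate(shifted) if k != keep]
        guess = guess + det(shifted) / det(minor)
    return guess


A = [[36.0, 4.062, 0.0, 0.0],
     [4.062, 16.0, 8.216, 0.0],
     [0.0, 8.216, 4.0, 14.49],
     [0.0, 0.0, 14.49, 0.0]]
root = iterate_root(A, 36.0)
```

Starting from 36, the iteration settles near 36.896 within a few steps, in
agreement with the final value of the example.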


Problem. Compute some of the coefficients of Table 15. 


23 See, for example, James and Coolidge, J. Chem. Phys. 1, 825 (1933); Cross and
Crawford, J. Chem. Phys. 5, 621 (1937). Another iterative method, which gives both
the eigenvalues and the amplitudes for a system of homogeneous linear equations, has
been described by W. Kohn, J. Chem. Phys. 17, 670 (1949).
PART 8. ERRORS AND LEAST SQUARES


13.29. Errors.—Measurements are always accompanied by errors.
They are of two kinds: determinate and random. Those of the first type²⁴
are often constant or systematic, being due to faulty or incorrectly adjusted
instruments, mistakes on the part of the observer in reading a scale,
recording a number or other similar effects. It is usually possible to
discover the causes of such errors and to make corrections for them. Random
errors, on the other hand, are indeterminate and due to unknown causes, but
they may be treated by statistical methods. As in the previous parts of this
chapter, we shall often refer the reader to other sources²⁵ for proofs of
theorems and results to be given here.

Suppose several equally reliable measurements of a physical quantity
yield the numbers $X_1, X_2, \ldots, X_n$. The corresponding errors are defined
by

$$
x_1 = X_1 - X, \quad x_2 = X_2 - X, \quad \ldots, \quad x_n = X_n - X \tag{13-70}
$$


where $X$ is the true value of the quantity. Actually, we seldom know²⁶
the true value since any experiment made to determine it will be accompanied
by random errors. However, in order to proceed further we must
choose some quantity which is called the most probable value. It will be
indicated by $\bar{X}$, the notation anticipating a fact that we prove in
sec. 13.30, namely, the most probable value is the average of all the data.
Since $\bar{X}$ is not equal to $X$, the true value, we must distinguish between
the error and the residual which is defined by

$$
d_1 = X_1 - \bar{X}, \quad d_2 = X_2 - \bar{X}, \quad \ldots, \quad d_n = X_n - \bar{X} \tag{13-71}
$$


It is assumed that the errors and residuals with which we are concerned
are random ones. They are neither systematic nor constant but are equally
likely to be positive or negative. Small errors are more frequent than large
ones and very large errors do not occur at all. Under these conditions,
the errors follow the laws of probability as given by the normal “Gauss”
distribution (see sec. 12.3)

$$
w(x) = \frac{e^{-x^{2}/2\sigma^{2}}}{\sigma\sqrt{2\pi}}
$$


24 Errors of this kind are discussed in some detail by Crumpler, T. B. and Yoe, J. H.,
“Chemical Computations and Errors,” John Wiley and Sons, New York, 1940. They
may be detected in some cases by methods explained by Birge, R. T., Phys. Rev. 40, 207
(1932).

25 See references at end of chapter.

26 An exception is the case where the quantity is exact by definition. For example,
the true value of the atomic weight of oxygen is 16.0000 to as many decimal places as
may be needed.
It is convenient for our purposes to change the notation, writing $w(x) = N$
and $h^{2} = 1/2\sigma^{2}$. The resulting equation

$$
N = \frac{h}{\sqrt{\pi}}\,e^{-h^{2}x^{2}} \tag{13-72}
$$

gives us the relative number of measurements $N$ having an error $x$. The
plot of $N$ vs. $x$ is called the Gauss error curve; it is shown in Fig. 1 for
$h = 1$ and $h = 0.6$. From that curve or from eq. (72), we can discover
the meaning of the constant $h$ which is called the precision index. When
it is large, $N$ is large for a given small error $x$ and decreases as $x$
increases.


Fig. 13-1


Thus a high precision index means that a large number of the measurements
agree closely with the true value of the quantity observed. On the other
hand, if $h$ is small, a smaller fraction of the results are close to the true
value and more large errors occur than in the previous case.

The probability that the error of a single measurement will lie between
the limits $\pm a$ is

$$
\frac{h}{\sqrt{\pi}}\int_{-a}^{a} e^{-h^{2}x^{2}}\,dx = \frac{2}{\sqrt{\pi}}\int_{0}^{ha} e^{-y^{2}}\,dy
$$

This integral occurs so often in mathematical physics that it has been given
the special name of the error function. It is usually denoted by

$$
\operatorname{erf}(t) = \frac{2}{\sqrt{\pi}}\int_{0}^{t} e^{-y^{2}}\,dy
$$
Hence the probability in question is $\operatorname{erf}(ha)$. It cannot be
evaluated in finite form but must be expanded in a power series and integrated
term by term. Values of the integral as a function of $t$ are found in all
books on probability.²⁷
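
The term-by-term integration mentioned above is easily carried out by machine.
The sketch below assumes the Maclaurin series
$\operatorname{erf}(t) = (2/\sqrt{\pi})\sum_{k}(-1)^{k}t^{2k+1}/[k!\,(2k+1)]$,
obtained by expanding $e^{-y^{2}}$ and integrating each term; the name
`erf_series` and the default of 40 terms are illustrative choices, adequate for
the moderate arguments met in this chapter.

```python
import math


def erf_series(t, terms=40):
    """Error function from the power series integrated term by term."""
    total = 0.0
    for k in range(terms):
        # k-th term of the integrated series for exp(-y^2)
        total += (-1) ** k * t ** (2 * k + 1) / (math.factorial(k) * (2 * k + 1))
    return 2.0 / math.sqrt(math.pi) * total


half_probability = erf_series(0.4769363)   # argument for which erf is one half
library_value = math.erf(0.4769363)        # standard-library value for comparison
```

The series converges for every $t$, but for large arguments the alternating
terms cancel heavily, so a tabulated or library value is preferable there.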

The special case where the limits of integration are $\pm\infty$ is of
considerable interest. The error must lie somewhere within this range, hence
the probability must be unity. This is readily found to be true when the
integration is performed.

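The simplest way of evaluating the integral when the limits are $\pm\infty$
is the following. Let

$$
I = \int_{-\infty}^{\infty} e^{-u^{2}}\,du; \qquad
I^{2} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(u^{2}+v^{2})}\,du\,dv
$$

Transforming to polar coordinates we get: $du\,dv = r\,dr\,d\theta$,
$u^{2} + v^{2} = r^{2}$,

$$
I^{2} = \int_{0}^{2\pi} d\theta \int_{0}^{\infty} e^{-r^{2}}\,r\,dr = \pi
$$

Thus we see that the area under the whole curve (72) is unity. This,
obviously, is the reason for the constant $2/\sqrt{\pi}$.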
13.30. Principle of Least Squares.—Suppose $n$ measurements have
been made, the $i$-th one having the error $x_i$. The probability that $x$ lies
between $x_i$ and $x_i + dx_i$ is

$$
P_i = \frac{h}{\sqrt{\pi}}\,e^{-h^{2}x_i^{2}}\,dx_i \tag{13-73}
$$


The probability that the $n$ errors $x_1, x_2, \ldots, x_n$ occur is the product
of $n$ terms like (73), for each measurement is an independent event. Hence
we have

$$
P = \prod P_i = \left(\frac{h}{\sqrt{\pi}}\right)^{n}
e^{-h^{2}(x_1^{2} + x_2^{2} + \cdots + x_n^{2})}\,dx_1\,dx_2 \cdots dx_n \tag{13-74}
$$


Clearly the differentials $dx_1, dx_2, \ldots$ are arbitrary, for they may be
interpreted as the smallest subdivisions on a scale which is being read.
Finally, remembering that $h$ is fixed, we see that the probability $P$ is a
maximum when the exponent of $e$ is a minimum; thus we have

$$
x_1^{2} + x_2^{2} + \cdots + x_n^{2} = \text{a minimum} \tag{13-75}
$$
as the criterion for the most probable value obtainable from n equally 


reliable measurements of a quantity. This result is known as the Principle 
of Least Squares. 


27 See also “ Tables of the Probability Functions P(x) and Erf (c),” Works Prog- 
ress Administration, New York City, 1941. 
In accordance with that principle, let us determine the most probable
value of the set of measurements $X_1, X_2, \ldots, X_n$. Rewrite (75) in the
form

$$
(X_1 - X)^{2} + (X_2 - X)^{2} + \cdots + (X_n - X)^{2}
$$

differentiate with respect to $X$ and equate the derivative to zero in order to
obtain a minimum. Since the result is to be the most probable value of $X$,
we replace $X$ by the symbol $\bar{X}$ to indicate that $\bar{X}$ is chosen to
satisfy eq. (75). The answer is

$$
\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \tag{13-76}
$$

As might be expected the most probable value is the arithmetic mean of all
of the experimental results. It is interesting to note that the error law of
eq. (72) is, within reasonable limits, the only form of equation which gives
the average as the most probable value.²⁸

13.31. Errors and Residuals.—If we add $n$ errors, we find, since
$x_i = X_i - X$,

$$
\sum X_i = nX + \sum x_i
$$

and from eq. (76)

$$
\bar{X} = \frac{1}{n}\sum X_i = X + \frac{1}{n}\sum x_i \tag{13-77}
$$

Also, we obtain for the first residual

$$
d_1 = X_1 - \bar{X} = X_1 - X - \frac{1}{n}\sum x_i
    = \frac{n-1}{n}\,x_1 - \frac{1}{n}\,x_2 - \frac{1}{n}\,x_3 - \cdots - \frac{1}{n}\,x_n \tag{13-78}
$$

with similar equations for the others. We thus conclude that as $n$ increases,
the second term on the right of (77) becomes smaller and $\bar{X}$ approaches
the true value $X$. In the same way, we conclude from (78) that as $n$
increases, the residuals approach the true errors. Actually, if we square $n$
equations like (78) and add them, we get

$$
\sum d_i^{2} = \sum x_i^{2} - \frac{1}{n}\left(\sum x_i\right)^{2}
$$

so that the sum of the squares of the residuals is slightly less than the sum
of the squares of the errors.

Suppose two independent quantities ($M_1$ and $M_2$) have been measured
and the errors in each case obey the normal law. Then the probability of

28 A proof is given by Plummer, H. C., “ Probability and Frequency,” Macmillan 
Co., London, 1940, p. 123. 
an error between $x_1$ and $x_1 + dx_1$ in $M_1$ is

$$
p_1 = \frac{h_1}{\sqrt{\pi}}\,e^{-h_1^{2}x_1^{2}}\,dx_1
$$

while the probability of an error between $x_2$ and $x_2 + dx_2$ in $M_2$ is

$$
p_2 = \frac{h_2}{\sqrt{\pi}}\,e^{-h_2^{2}x_2^{2}}\,dx_2
$$

Since the observed quantities are independent of each other, the probability
of the simultaneous occurrence of these errors in $M_1$ and $M_2$ is

$$
p = p_1p_2
$$
Now suppose that $M_1$ and $M_2$ are combined linearly to form a quantity

$$
M = a_1M_1 + a_2M_2
$$

where $a_1$ and $a_2$ are constants. The error in $M$ will lie between

$$
a_1x_1 + a_2x_2 = x \tag{13-79}
$$

and

$$
a_1(x_1 + dx_1) + a_2(x_2 + dx_2) = x + dx
$$

We recognize the fact that such an error may be composed of any value of
$x_1$ between $\pm\infty$ together with the corresponding value of $x_2$ fixed
by eq. (79). Thus to compute the probability of an error $x$ in $M$ we integrate
$p = p_1p_2$ with respect to $x_1$ between the limits $\pm\infty$ and eliminate
$dx_2$ by the relation $dx = a_2\,dx_2$ which will be true when the integration
has been performed over $x_1$. Let us first rewrite $p$ in terms of $x_1$ and
$x$, which gives

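$$
p = C\exp\left[-h_1^{2}x_1^{2} - h_2^{2}\left(\frac{x - a_1x_1}{a_2}\right)^{2}\right]dx_1\,dx_2
$$

where $C = h_1h_2/\pi$. With the further abbreviation

$$
H^{2} = \frac{h_1^{2}h_2^{2}}{a_1^{2}h_2^{2} + a_2^{2}h_1^{2}}
$$

we also have

$$
p = C\exp\left[-H^{2}x^{2} - \frac{h_1^{2}h_2^{2}}{a_2^{2}H^{2}}
\left(x_1 - \frac{a_1H^{2}x}{h_1^{2}}\right)^{2}\right]dx_1\,dx_2
$$

Let $N(x)\,dx$ be the required probability of an error in $M$ between $x$ and
$x + dx$, then

$$
N(x)\,dx = Ce^{-H^{2}x^{2}}\,dx_2\int_{-\infty}^{\infty}
\exp\left[-\frac{h_1^{2}h_2^{2}}{a_2^{2}H^{2}}
\left(x_1 - \frac{a_1H^{2}x}{h_1^{2}}\right)^{2}\right]dx_1
= Ce^{-H^{2}x^{2}}\,dx_2\left[\frac{a_2H\sqrt{\pi}}{h_1h_2}\right]
$$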
Since $a_2\,dx_2 = dx$ we see that

$$
N(x) = \frac{H}{\sqrt{\pi}}\,e^{-H^{2}x^{2}}
$$

where

$$
H = \frac{h_1h_2}{\sqrt{a_1^{2}h_2^{2} + a_2^{2}h_1^{2}}} \tag{13-80}
$$

or

$$
\frac{1}{H^{2}} = \frac{a_1^{2}}{h_1^{2}} + \frac{a_2^{2}}{h_2^{2}}
$$

Thus the error law for $M$ is the same as the error law for $M_1$ and $M_2$, the
only difference being in the precision index. The equation is easily
generalized by the same method; in fact it may be shown that if

$$
M = \sum a_iM_i
$$

the precision index of $M$ is given by

$$
\frac{1}{H^{2}} = \sum \frac{a_i^{2}}{h_i^{2}} \tag{13-81}
$$

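We would like to apply this result to the residuals. From (78), we may
write

$$
d_i = \frac{n-1}{n}\,x_i - \frac{1}{n}{\sum_j}'\,x_j \tag{13-78a}
$$

where the prime on the summation sign means that the term $j = i$ is
omitted. The residuals are thus linear combinations of the errors, for $d_i$
corresponds to $M$ in the preceding discussion and

$$
a_1 = \frac{n-1}{n}; \qquad a_2 = a_3 = \cdots = -\frac{1}{n}
$$

The error law for the residuals is of the form of (72) or (80)

$$
N = \frac{H}{\sqrt{\pi}}\,e^{-H^{2}d^{2}} \tag{13-80a}
$$

and from (81), since $h$ is the precision index for each $x_i$,

$$
\frac{1}{H^{2}} = \frac{1}{h^{2}}\left[\left(\frac{n-1}{n}\right)^{2}
+ \frac{1}{n^{2}} + \frac{1}{n^{2}} + \cdots\right] = \frac{1}{h^{2}}\,\frac{n-1}{n}
$$

or

$$
H = h\sqrt{\frac{n}{n-1}} \tag{13-82}
$$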
From (82), it is seen that the precision index for the residuals depends on
both $n$ and $h$ and is always larger than $h$. Reference to Fig. 1 shows that
the curve of (80a) rises higher in the middle and falls off more rapidly than
the curve of (72) but as the number of measurements increases the two
graphs approach each other more closely.

13.32. Measures of Precision.—Having obtained the most probable
value of a series of measurements, we need to find expressions for its
reliability. In order to do this we must first consider the case where the true
value $X$ of the quantity is known. We may then proceed to the more
practical question of expressing the uncertainty of $\bar{X}$ in terms of the
residuals. If the precision index were known it would be suitable for our
measure of precision for as we have seen in sec. 13.29,
$\operatorname{erf}(hx)$ is the probability that the error is within the range
$\pm x$. However, $h$ has the dimension of a reciprocal error and it proves more
convenient to use as a precision measure a quantity which is inversely
proportional to $h$, thus having the same dimension as the error itself. Three
such measures are commonly employed; they are the average error ($a$), the root
mean square error ($m$) and the probable error ($r$).

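The average error is the arithmetic mean of all the errors without regard
to sign;

$$
a = \frac{\sum |x_i|}{n} \tag{13-83}
$$

From its definition (see sec. 12.3), it follows that

$$
a = \int_{-\infty}^{\infty} |x|\,N\,dx
  = \frac{2h}{\sqrt{\pi}}\int_{0}^{\infty} x\,e^{-h^{2}x^{2}}\,dx
  = \frac{1}{h\sqrt{\pi}} \tag{13-84}
$$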
Let us seek the most probable value of $h$. We recall that $P$ of eq. (74)
is the probability of the simultaneous occurrence of the errors
$x_1, x_2, \ldots, x_n$. Hence we must make $P$ of that equation a maximum.
Taking the logarithm of (74) we see that the most probable value of $h$ is that
quantity $h'$ which makes

$$
\phi = n\log h - h^{2}\sum x_i^{2}
$$

a maximum, or

$$
\frac{\partial\phi}{\partial h} = \frac{n}{h} - 2h\sum x_i^{2} = 0
$$

hence $h' = \sqrt{n/(2\sum x_i^{2})}$.
The quantity $m$ defined by

$$
m = \frac{1}{h'\sqrt{2}} = \sqrt{\frac{\sum x_i^{2}}{n}} \tag{13-85}
$$
is called the root mean square error. Comparison of (84) with (85) shows
that

$$
a = m\sqrt{\frac{2}{\pi}} \tag{13-86}
$$

The root mean square error is frequently used in mathematical statistics;
there it is called the standard deviation and indicated by $\sigma$ (see sec.
12.3, especially problem a).

The probable error is defined as that error $r$ such that one half of the
errors of $n$ observations are greater than $r$ and one half are less than $r$.
Thus it is given by the integral

$$
\operatorname{erf}(hr) = \tfrac{1}{2} \tag{13-87}
$$

for this says that there is an equal chance that a given error lies within
$\pm r$ or outside these limits. From tables of the integral, we obtain

$$
r = \frac{0.4769363}{h} \tag{13-88}
$$

Combining this result with (85) we get for the probable error

$$
r = 0.6745\sqrt{\frac{\sum x_i^{2}}{n}} = 0.6745\,m \tag{13-89}
$$

From eqs. (86) and (89) we can readily obtain all relations between
$a$, $m$ and $r$. They are

$$
\begin{aligned}
r &= 0.4769\,h^{-1} = 0.6745\,m = 0.8453\,a \\
m &= 0.7071\,h^{-1} = 1.4826\,r = 1.2533\,a \\
a &= 0.5642\,h^{-1} = 0.7979\,m = 1.1829\,r
\end{aligned}
$$
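
These numerical factors can be recomputed rather than looked up. The sketch
below assumes Python's standard `math.erf` and a simple bisection to solve
eq. (13-87); the names `solve_erf`, `rho`, `a_h` and `m_h` are illustrative.

```python
import math


def solve_erf(target, lo=0.0, hi=5.0, tol=1e-12):
    """Bisection for the argument at which erf equals `target`."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rho = solve_erf(0.5)               # r = rho / h, eq. (13-88)
a_h = 1.0 / math.sqrt(math.pi)     # a = a_h / h, eq. (13-84)
m_h = 1.0 / math.sqrt(2.0)         # m = m_h / h, eq. (13-85)

# Ratios between the three measures; h cancels in each quotient.
ratios = {
    "r/m": rho / m_h, "r/a": rho / a_h,
    "m/r": m_h / rho, "m/a": m_h / a_h,
    "a/m": a_h / m_h, "a/r": a_h / rho,
}
```

Each entry of `ratios` reproduces the corresponding factor in the table of
relations to the four decimals given there.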


The geometric significance of the three precision measures is also of interest.
The average error $a$ is the abscissa of the center of gravity of the area
bounded by the error curve and the axes $x$ and $N$ of eq. (72). To see this,
let $x_0$ be the center of gravity of that area, then

$$
x_0 = \frac{\int_{0}^{\infty} xN\,dx}{\int_{0}^{\infty} N\,dx} = \frac{1}{h\sqrt{\pi}}
$$

which follows from (84) since $\int_{-\infty}^{\infty} N\,dx = 1$.


The root mean square error is the radius of gyration of the same area
about the $N$ axis; it is also the abscissa of the point of inflection of the
error curve, as will now be shown. For the point of inflection,
$d^{2}N/dx^{2} = 0$ and from (72)

$$
\frac{dN}{dx} = -\frac{2h^{3}x}{\sqrt{\pi}}\,e^{-h^{2}x^{2}}; \qquad
\frac{d^{2}N}{dx^{2}} = -\frac{2h^{3}}{\sqrt{\pi}}\,e^{-h^{2}x^{2}}\,(1 - 2h^{2}x^{2}) = 0
$$

Thus,

$$
(1 - 2h^{2}x^{2}) = 0
$$

or

$$
x = \pm\frac{1}{h\sqrt{2}} = \pm m
$$


From the definition of $r$, it follows that the abscissa $x = r$ corresponds
to the ordinate which bisects the area of the error curve (72) between 0
and $\infty$.


Fig. 13-2 (positions of $r$, $a$ and $m$ on the error curve)


The relative sizes and positions of these three measures are shown in 
Fig. 2 where we draw only that half of (72) corresponding to positive 
values of x. It is perhaps not amiss to comment on the most appropriate 
measure to use. The average error recommends itself because of the ease 
with which it is computed. The probable error is less easy to calculate²⁹


29 Convenient tables of 0.6745/n as a function of n and other quantities useful in
the calculation of errors may be found in “Handbook of Chemistry and Physics,”
Chemical Rubber Publishing Co., Cleveland, Ohio.

but is perhaps more often used than the others in chemical and physical
literature. As may be seen from Fig. 2 it is the smallest of the three and is
thus more flattering than $a$ or $m$ to a set of experimental data. There is
little choice between the three measures on theoretical grounds. 

It is often of importance to find some estimate of the probable error of
an adopted precision measure itself. The result has been obtained by
Gauss³⁰ who shows that the relative error of $r$ is

$$
\frac{0.4769}{\sqrt{n}}
$$

With 10 measurements, it is seen that the probable error is uncertain by
about 15 per cent while for even 500 measurements the uncertainty is
2 per cent. It thus follows that it is seldom if ever of meaning to state
the probable error with more than two significant figures, for usually one
of these is uncertain.

13.33. Precision Measures and Residuals.—From the equations of
the previous section it is a simple matter to express the precision measures
in terms of residuals. Suppose $X_1, X_2, \ldots, X_n$ are $n$ observations. If
they follow the error law, the residual $d_i$ is given by eq. (78a) and the
index of precision of the residuals by eq. (82). Therefore, the average error

$$
a = \frac{1}{h\sqrt{\pi}} = \sqrt{\frac{n}{n-1}}\,\frac{\sum |d_i|}{n}
  = \frac{\sum |d_i|}{\sqrt{n(n-1)}} \tag{13-83a}
$$

Similarly,

$$
m = \frac{1}{h\sqrt{2}} = \sqrt{\frac{n}{n-1}}\,\sqrt{\frac{\sum d_i^{2}}{n}}
  = \sqrt{\frac{\sum d_i^{2}}{n-1}} \tag{13-85a}
$$

and

$$
r = 0.6745\,m = 0.6745\sqrt{\frac{\sum d_i^{2}}{n-1}} \tag{13-89a}
$$


The differences between eqs. (83), (85), (89) and (83a), (85a), (89a) 
should be carefully noted. In many cases, the deviations are used in 
place of the errors to get a from (83) rather than from the correct 
eq. (83a). The difference is negligible, of course, in most cases. 
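
As an illustration of eqs. (13-83a), (13-85a) and (13-89a), the three measures
may be computed directly from a set of observations. The sketch below is a
minimal one; the function name `precision_measures` and the five sample
readings are invented for the purpose and do not come from any experiment.

```python
import math


def precision_measures(observations):
    """Return the mean and the measures a, m, r of a single observation."""
    n = len(observations)
    mean = sum(observations) / n
    residuals = [x - mean for x in observations]
    a = sum(abs(d) for d in residuals) / math.sqrt(n * (n - 1))   # eq. (13-83a)
    m = math.sqrt(sum(d * d for d in residuals) / (n - 1))        # eq. (13-85a)
    r = 0.6745 * m                                                # eq. (13-89a)
    return mean, a, m, r


# Hypothetical readings, used only to exercise the formulas.
readings = [10.1, 9.9, 10.0, 10.2, 9.8]
mean, a, m, r = precision_measures(readings)
```

For these readings the residuals are 0.1, −0.1, 0, 0.2, −0.2, so that
$a = 0.6/\sqrt{20}$ and $m = \sqrt{0.1/4}$.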

The most probable value or arithmetic mean also follows the error law.
Its index of precision is obtained from (81) where $a_i = 1/n$; hence

$$
\frac{1}{H^{2}} = \frac{1}{n^{2}}\sum\frac{1}{h^{2}} = \frac{1}{nh^{2}}
$$


30 A derivation of it is given by Plummer, loc. cit.

Thus if $a$, $m$ and $r$ refer to the individual members of a set of $n$
measurements, the corresponding precision measures relating to the arithmetic
mean are

$$
\frac{a}{\sqrt{n}}, \qquad \frac{m}{\sqrt{n}}, \qquad \frac{r}{\sqrt{n}}
$$

It will be observed that the precision varies as the square root of n. There- 
fore comparatively little is gained by increasing n, for in order to change 
the precision by one decimal point n must be multiplied by 100. This is in 
accord with common sense which suggests that instead of making 100 
measurements it is more economical and reasonable to seek an improvement 
in the experimental method. A graph of $r$ versus $n$ is shown in Fig. 3. It
will be seen from that curve that it is seldom worthwhile to make more than 
10 measurements of a given quantity by the same method. 


Fig. 13-3 (plot of $r$ versus $n$)


13.34. Experiments of Unequal Weight.—It often happens that the
results of one experimenter are more reliable than those of another. This
may be due to superior method or apparatus, to greater experience with the
operations involved or to other reasons. Moreover, because of particularly
favorable conditions, the same investigator may obtain better results
at some times than at others. In all such cases, more weight is attached to
some of the data than to the remainder of them. For example, if one result
$X_1$ has a weight twice that of $X_2$, then the average
$\bar{X} = (2X_1 + X_2)/3$. A result of weight $w$ is thus equivalent to $w$
results of unit weight, or we say that a result of large weight has a high
precision index.

If the $j$-th measurement is of weight $w_j$, the weighted average or most
probable value is

$$
\bar{X}_w = \frac{\sum w_iX_i}{\sum w_i}
$$
The probable error for the value of weight $w_j$ is

$$
p_{w_j} = \pm 0.6745\sqrt{\frac{\sum w_id_i^{2}}{(n-1)\,w_j}}
$$

and for the weighted average

$$
P_w = \pm 0.6745\sqrt{\frac{\sum w_id_i^{2}}{(n-1)\sum w_i}}
$$

It is possible to determine the relative weights to be attached to the
individual measurements since the weight $w_j$ is inversely proportional to
$p_{w_j}^{2}$. The usual custom is to assign weights arbitrarily.
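
The weighted formulas may be illustrated as follows. The sketch assumes that
the residuals $d_i$ are taken from the weighted average; the function name
`weighted_mean` and the two sample values are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Weighted average with probable errors of each value and of the average."""
    n = len(values)
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total
    # Weighted sum of squared residuals about the weighted average.
    weighted_sq = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    per_value = [0.6745 * math.sqrt(weighted_sq / ((n - 1) * w)) for w in weights]
    of_mean = 0.6745 * math.sqrt(weighted_sq / ((n - 1) * total))
    return mean, per_value, of_mean


# One result of weight 2 and one of weight 1, as in the text: (2*X1 + X2)/3.
mean, per_value, of_mean = weighted_mean([10.0, 13.0], [2.0, 1.0])
```

The squared ratio of the two entries of `per_value` equals the inverse ratio of
the weights, which is the proportionality stated above.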

13.35. Probable Error of a Function.—In general the results of several
independently measured quantities are combined to give the final value of
the physical constant desired. Suppose $X, Y, \ldots$ have been obtained as
the average value of certain quantities with probable errors $P_X, P_Y, \ldots$.
If they are combined to give $Z$, where

$$
Z = f(X, Y, \ldots)
$$

then its probable error is

$$
P = \pm\sqrt{\left(P_X\,\frac{\partial Z}{\partial X}\right)^{2}
+ \left(P_Y\,\frac{\partial Z}{\partial Y}\right)^{2} + \cdots}
$$

We record a few special cases for convenience of reference.

1. $Z = X + Y$; $\quad P = \pm\sqrt{P_X^{2} + P_Y^{2}}$

2. $Z = XY$; $\quad P = \pm\sqrt{(XP_Y)^{2} + (YP_X)^{2}}$

3. $Z = X/Y$; $\quad P = \pm\dfrac{1}{Y^{2}}\sqrt{(YP_X)^{2} + (XP_Y)^{2}}$

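4. $Z = a + bX$. Suppose we know the value $Z_1$ with its probable
error $p_1$ at the point $X = X_1$ and $Z_2$ with error $p_2$ at $X = X_2$. We
wish to fit the two points to a linear equation. Then

$$
\begin{aligned}
P_a &= \pm\sqrt{\left(\frac{X_2\,p_1}{X_2 - X_1}\right)^{2} + \left(\frac{X_1\,p_2}{X_2 - X_1}\right)^{2}} \\
P_b &= \pm\sqrt{\left(\frac{p_1}{X_2 - X_1}\right)^{2} + \left(\frac{p_2}{X_2 - X_1}\right)^{2}} \\
P_Z &= \pm\sqrt{\left(\frac{p_1(X_2 - X)}{X_2 - X_1}\right)^{2} + \left(\frac{p_2(X - X_1)}{X_2 - X_1}\right)^{2}}
\end{aligned}
$$

where $P_a$, $P_b$ and $P_Z$ are the probable errors in $a$, $b$ and $Z$,
respectively.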
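
The general formula for the probable error of a function can be applied
numerically when the partial derivatives are awkward to write down. The sketch
below assumes central finite differences for the derivatives; the function name
`propagated_error`, the step size and the sample numbers are illustrative only.

```python
import math


def propagated_error(func, values, errors, step=1e-6):
    """P = sqrt(sum (P_i * dZ/dX_i)^2) with derivatives by central differences."""
    total = 0.0
    for i, (value, error) in enumerate(zip(values, errors)):
        h = step * max(1.0, abs(value))
        up = list(values)
        up[i] = value + h
        down = list(values)
        down[i] = value - h
        derivative = (func(*up) - func(*down)) / (2.0 * h)
        total += (error * derivative) ** 2
    return math.sqrt(total)


# Hypothetical averages X = 3, Y = 4 with probable errors 0.1 and 0.2.
X, Y, PX, PY = 3.0, 4.0, 0.1, 0.2
p_sum = propagated_error(lambda x, y: x + y, [X, Y], [PX, PY])
p_product = propagated_error(lambda x, y: x * y, [X, Y], [PX, PY])
p_quotient = propagated_error(lambda x, y: x / y, [X, Y], [PX, PY])
```

The three results may be compared with special cases 1, 2 and 3, which give
$\sqrt{0.05}$, $\sqrt{0.52}$ and $\sqrt{0.52}/16$ for these numbers.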
13.36. Rejection of Observations.—Occasionally a single measurement
from a set differs so widely from the others that the experimenter is tempted
to discard it. A simple rule in such cases, based on statistical methods, is
the following: Calculate the average of all the data including the suspected
measurement. Find each residual and calculate the probable error of a
single determination. If any residual exceeds five times the probable error
it may be rejected, the supposition being that the error cannot be a random
one. The reason for the use of this rule is as follows. Suppose the
probability of an error as large as $x_i$ in the quantity measured is 0.001,
then the chance that an error as large as $x_i$ will not occur is 0.999. Let us
then determine the value of $hx$ for which $\operatorname{erf}(hx) = 0.999$.
From tables of this integral we find³¹

$$
hx = 2.326
$$

Now from eq. (88) we have

$$
hr = 0.4769
$$

thus

$$
x = 4.9\,r
$$

We conclude that the probability of an error 5 times as great as the probable
error of a single measurement is less than 1 in 1000, hence the somewhat
dogmatic rule for rejecting such measurements.
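
The rejection rule is simple to apply by program. The sketch below assumes the
probable error of a single determination is taken from the residuals as
$0.6745\sqrt{\sum d_i^{2}/(n-1)}$; the function name `reject_outliers` and both
data sets are invented for illustration.

```python
import math


def reject_outliers(observations, factor=5.0):
    """Split the data into kept and rejected values by the five-times rule."""
    n = len(observations)
    mean = sum(observations) / n          # average includes the suspected value
    residuals = [x - mean for x in observations]
    r = 0.6745 * math.sqrt(sum(d * d for d in residuals) / (n - 1))
    kept = [x for x, d in zip(observations, residuals) if abs(d) <= factor * r]
    rejected = [x for x, d in zip(observations, residuals) if abs(d) > factor * r]
    return kept, rejected, r


# Hypothetical set: nineteen consistent readings and one discordant one.
long_set = [9.9, 10.0, 10.1] * 6 + [10.0, 12.0]
kept, rejected, r = reject_outliers(long_set)

# The same discordant reading in a set of only four values.
short_kept, short_rejected, _ = reject_outliers([10.0, 10.1, 9.9, 12.0])
```

In the longer set the reading 12.0 is rejected; in the set of four nothing is
rejected, because a single discordant value there inflates the probable error
too much for any residual to reach five times its size.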

13.37. Empirical Formulas.—As mentioned in sec. 13.1, there is
considerable advantage in representing experimental data by means of
equations, the correct form of them being often suggested by theoretical
considerations. In other cases, plots of various functions of the data may
indicate a suitable form. When this question is settled, the next step is to
determine the constants in the equation. Sometimes a graph may be used
for this purpose, for if the equation is linear it is only necessary to
determine the slope and intercept of the curve. In more exact work, numerical
methods are needed.


a. The Method of Averages. Suppose that the quantity $y$ has been
observed as a function of another quantity $x$, the resulting numbers being
$y_1, y_2, \ldots, y_n$. It has been decided that a polynomial of the $m$-th
degree, $m < n$, is a suitable equation

$$
y = A + Bx + Cx^{2} + \cdots \tag{13-90}
$$


Divide the measurements into groups equal in number to the unknown 
constants, placing an equal number of results in each group if this is 
possible. Add the equations in each group thus obtaining a set of simul- 
taneous equations equal in number to the number of unknowns. The 
equations may be solved by the methods of sec. 13.26. 


31 See, for example, the reference in footnote 29. 
It will be found in general that this procedure is quite satisfactory.
The resulting constants are different for different groupings of the data, but 
the simplest such grouping is usually better than any other. If there are a 
large number of results or if the polynomial is of degree higher than four or 
five this method is nearly as good as the method of least squares and entails 
considerably less calculation. 


b. The Method of Least Squares. Suppose as before that $n$ values are
available for $y$ but that the chosen equation is of a more general form than
(90),

$$
y = f(x; A, B, C, \ldots) \tag{13-91}
$$

If there are $n$ constants we may obviously fit the data exactly to such an
equation but usually there will only be $m < n$ constants. Thus the calculated
value of $y$ will not agree with the observed one. Let

$$
y_i = y_i\,(\text{calc.}) + d_i
$$

where $y_i$ is an observed $y$ and $y_i$ (calc.) is the corresponding calculated
one using the constants finally adopted. In accordance with the principle of
least squares we wish to make

$$
\sum d_i^{2} = \text{a minimum} \tag{13-92}
$$

Let us now assume that we have found approximate values of the constants by
graphical means or otherwise so that

$$
A = A_0 + a; \quad B = B_0 + b; \quad C = C_0 + c; \quad \ldots
$$

$a, b, c, \ldots$ being small correction terms. Then the $i$-th equation of (91)

$$
f_i(A, B, C, \ldots) = y_i - d_i
$$

may be written as

$$
f_i(A_0, B_0, C_0, \ldots) + a\,\frac{\partial f_i}{\partial A_0}
+ b\,\frac{\partial f_i}{\partial B_0} + c\,\frac{\partial f_i}{\partial C_0}
+ \cdots = y_i - d_i \tag{13-93}
$$

where we have discarded derivatives of second and higher order. Using
the abbreviations

$$
\frac{\partial f_i}{\partial A_0} = u_i, \quad
\frac{\partial f_i}{\partial B_0} = v_i, \quad
\frac{\partial f_i}{\partial C_0} = w_i, \quad \ldots
$$

and

$$
y_i - f_i(A_0, B_0, C_0, \ldots) = F_i
$$

(93) becomes

$$
u_ia + v_ib + w_ic + \cdots - F_i + d_i = 0
$$
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where Us Vi Wi Fi are known and a, b, c, d; are unknown. Since we wish 
(92) to hold we must require that 


n. 


Zo lua + vb + wet — Fi? = (a,b,c) 


i=] 


be a minimum or that 


T L Eua + vib + wet: — Fiju = 0 
08 a b F,)u; = 0 
z7 Qe (uia + vb + wie t +e — Fiw = (13-94) 
SE = IE (usa + ob + wee H Pijus = 0 


These equations (when divided by two) are called the normal equations. 
There will be as many of them as there are unknowns. 

In many cases, the chosen relation between x and y is a polynomial, 
when some simplification in the procedure is possible. The original equations 
corresponding to (91) will be of the form 

$$A + Bx_i + Cx_i^2 + \cdots = y_i \tag{13-95}$$

It is still worthwhile to use approximate values of the constants for then the 
normal equations will be easier to handle. If this is done (95) becomes 

$$a + bx_i + cx_i^2 + \cdots = F_i \tag{13-95a}$$


In either case, the normal equations may be written down without differen- 
tiation. They are found as follows: (1) multiply each equation of (95) 
or (95a) by the coefficient of the first unknown (unity since we are speaking 
of A or a) and add the resulting n equations; (2) multiply each equation 
by the coefficient of the next unknown (x;) and add these equations; (3) 
continue in the same way until each equation has been multiplied by the 
coefficient of each unknown. The resulting normal equations which are 
identical with those obtained by the procedure leading to eq. (94) may then 
be solved by the methods of sec. 13.26 to obtain the constants. The final 
equation should always be checked by using it to compute each known $y_i$. 
The sum of the squares of the residuals should be small and the algebraic 
sum of the residuals themselves should be nearly zero.³² 
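
The three-step recipe for writing down the normal equations of (95) can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names and the data are invented, and a small Gaussian elimination stands in for the methods of sec. 13.26.

```python
def normal_equations(xs, ys, degree):
    """Normal equations for y = A + B x + C x^2 + ... as in eq. (13-95)."""
    m = degree + 1
    # Row k: multiply every equation by x_i**k (coefficient of the k-th
    # unknown) and add the n resulting equations.
    matrix = [[sum(x ** (j + k) for x in xs) for j in range(m)] for k in range(m)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(m)]
    return matrix, rhs


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting (hypothetical helper)."""
    n = len(rhs)
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (a[r][n] - tail) / a[r][r]
    return solution


def fit_polynomial(xs, ys, degree):
    matrix, rhs = normal_equations(xs, ys, degree)
    return solve_linear(matrix, rhs)


def residuals(xs, ys, coeffs):
    """Observed minus calculated y for every measurement."""
    return [y - sum(c * x ** k for k, c in enumerate(coeffs))
            for x, y in zip(xs, ys)]


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 9.2, 18.8, 33.1]      # invented data, roughly 1 + 2 x^2
coeffs = fit_polynomial(xs, ys, 2)
res = residuals(xs, ys, coeffs)
print(coeffs, sum(res), sum(d * d for d in res))
```

Because the first normal equation is the plain sum of the n equations, the algebraic sum of the residuals comes out zero to rounding error, which is the check recommended above.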

Such a procedure will show how closely the curve fits the known points 
but says nothing about the reliability of the curve at other places. In the 
important case of a linear equation, y = a + bx, the formulas³² are comparatively 
simple. The probable errors in a and b are 

$$P_a = r_e\sqrt{\frac{\sum x_i^2}{D}}; \quad P_b = r_e\sqrt{\frac{n}{D}}$$

$$r_e = 0.6745\sqrt{\frac{\sum d_i^2}{n-2}}; \quad D = n\sum x_i^2 - \Bigl(\sum x_i\Bigr)^2$$

The error in y at any point x (x not necessarily a measured value) is 

$$P_y = r_e\sqrt{\frac{\sum x_i^2 - 2x\sum x_i + nx^2}{D}}$$

32 Further details of the method of least squares are given by Brunt, D., “The 
Combination of Observations,” Cambridge Press, 1917. He describes several schemes 
for checking the calculations and evaluating the constants with their probable errors. 
See also, Birge, R. T., Revs. Mod. Phys. 19, 298 (1947). 


CHAPTER 14 
LINEAR INTEGRAL EQUATIONS 


14.1. Definitions and Terminology.—An integral equation is one which 
contains the unknown function behind the integral sign. Its importance 
for physical problems lies in the fact that most differential equations together 
with their boundary conditions may be reformulated to give a single integral 
equation. If the latter can be solved, the mathematical difficulties are 
not appreciably greater even when the number of independent variables 
is increased, while differential equations, such as Laplace’s, are considerably 
more complex in three dimensions than in two. The theory of integral 
equations also furnishes a uniform method for the study of the eigenvalue 
problems of mathematical physics. 

A linear integral equation of the third kind, the most general type considered, 
has the form 

$$g(x)\phi(x) = f(x) + \lambda\int_a^b K(x,z)\phi(z)\,dz \tag{14-1}$$

The known functions are g(x), f(x) and K(x,z), the latter being called the 
kernel or nucleus. The limits of integration a and b are either known functions 
of x or constants; $\lambda$ is an absolute constant or a parameter. It is 
desired to find the unknown $\phi$ as a function of the independent variable x. 
Four special cases of (1) have been most widely studied. In Fredholm’s 
equation of the first kind, g(x) = 0, and in his equation of the second kind, 
g(x) = 1; in both cases a and b are constants. Volterra’s equations of the 
first and second kind are like Fredholm’s equations except that a = 0, and 
b = x. If f(x) = 0 in either case, the equation is said to be homogeneous. 
When one or both limits become infinite or when the kernel becomes infinite 
at one or more points within the range a to b, the equation is called singular. 
Non-linear integral equations may occur in the form 

$$\phi(x) = f(x) + \lambda\int_a^b K(x,z)\phi^n(z)\,dz$$

or 

$$\phi(x) = f(x) + \lambda\int_a^b F[x,z,\phi(z)]\,dz$$

We limit¹ our discussion here to linear equations in one variable where the 
unknown $\phi$ enters only to the first power. Our plan is to present first the 
purely formal mathematical methods of solution. We then show how to 
convert differential equations into integral equations and apply the theory 
to certain physical problems. 


GENERAL METHODS OF SOLVING INTEGRAL EQUATIONS 


14.2. The Liouville-Neumann Series.—a. Fredholm’s Equation of the 
Second Kind. Suppose the given integral equation is 

$$\phi(x) = f(x) + \lambda\int_a^b K(x,z)\phi(z)\,dz \tag{14-2}$$

where x and z are real variables with $a \le x \le b$, $a \le z \le b$; K(x,z) and 
f(x) are continuous but may be complex. We attempt to solve (2) by 
means of a power series in $\lambda$: 

$$\phi(x) = \sum_{n=0}^{\infty} \lambda^n\phi_n(x) \tag{14-3}$$

Substituting (3) into (2) and equating coefficients of equal powers of $\lambda$ we 
obtain 

$$\begin{aligned}
\phi_0(x) &= f(x) \\
\phi_1(x) &= \int_a^b K(x,z)\phi_0(z)\,dz \\
\phi_2(x) &= \int_a^b K(x,z)\phi_1(z)\,dz \\
&\cdots \\
\phi_n(x) &= \int_a^b K(x,z)\phi_{n-1}(z)\,dz
\end{aligned} \tag{14-4}$$

Remembering that both x and z are restricted to lie between a and b, we see 
that the kernel and f(x) must have maximum values, for we assumed them 
to be continuous. Let these maxima be given by $|K(x,z)| \le M$, $|f(x)| \le N$. 
Then it follows that 

$$|\phi_0| \le N, \quad |\phi_1| \le NM(b-a), \quad \cdots, \quad |\phi_n| \le NM^n(b-a)^n$$


If 

$$|\lambda| < \frac{1}{M(b-a)} \tag{14-5}$$

the series (3) which is called the Liouville-Neumann series converges uniformly 
and is the unique continuous² solution of (2) within the range 
$a \le x \le b$. 


1 References to more complete accounts of the subject will be found at the end of this 
chapter. Integral equations are frequently encountered in current physical and chemical 
literature, indicating that they are powerful tools for handling a variety of problems. 
Many examples of such usage are given by Morse, P. M., and Feshbach, H., “Methods 
of Theoretical Physics,” McGraw-Hill Book Co., Inc., New York, 1953. 
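
The construction of the series by (3) and (4) lends itself to a direct numerical sketch. The code below is only an illustration under our own assumptions: the integrals are replaced by a composite Simpson rule on a fixed grid, and the kernel K(x,z) = xz with f(x) = x on the range 0 to 1 is an invented test case whose closed-form solution, 3x/(3 − λ), is used for comparison.

```python
def simpson_weights(a, b, n_intervals):
    """Weights of the composite Simpson rule; n_intervals must be even."""
    h = (b - a) / n_intervals
    weights = []
    for i in range(n_intervals + 1):
        if i == 0 or i == n_intervals:
            weights.append(h / 3.0)
        elif i % 2 == 1:
            weights.append(4.0 * h / 3.0)
        else:
            weights.append(2.0 * h / 3.0)
    return weights


def liouville_neumann(kernel, f, lam, a, b, n_intervals=20, n_terms=40):
    """Partial sum of (14-3), each phi_n built from phi_{n-1} as in (14-4)."""
    h = (b - a) / n_intervals
    grid = [a + i * h for i in range(n_intervals + 1)]
    weights = simpson_weights(a, b, n_intervals)
    phi_n = [f(x) for x in grid]              # phi_0 = f
    total = phi_n[:]
    for n in range(1, n_terms + 1):
        # phi_n(x) = integral of K(x,z) phi_{n-1}(z) dz, by quadrature
        phi_n = [
            sum(w * kernel(x, z) * p for w, z, p in zip(weights, grid, phi_n))
            for x in grid
        ]
        total = [t + lam ** n * p for t, p in zip(total, phi_n)]
    return grid, total


# Invented test case: here M = 1 and b - a = 1, so (14-5) reads |lambda| < 1.
grid, phi = liouville_neumann(lambda x, z: x * z, lambda x: x, 0.5, 0.0, 1.0)
exact = [3.0 * x / (3.0 - 0.5) for x in grid]
print(max(abs(p - e) for p, e in zip(phi, exact)))
```

The number of terms retained is a free choice; when (5) holds the terms fall off at least geometrically, so a few dozen suffice here.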

In order to obtain the solution in more convenient form, we define the 
iterated kernels³ 

$$\begin{aligned}
K_1(x,z) &= K(x,z) \\
K_2(x,z) &= \int K(x,y)K_1(y,z)\,dy \\
&\cdots \\
K_n(x,z) &= \int K(x,y)K_{n-1}(y,z)\,dy \\
&= \int\!\!\int\cdots\int K(x,y_1)K(y_1,y_2)\cdots K(y_{n-1},z)\,dy_1\,dy_2\cdots dy_{n-1}
\end{aligned} \tag{14-6}$$

Introducing these functions into (4) we may write 

$$\begin{aligned}
\phi_1(x) &= \int K(x,z)f(z)\,dz \\
\phi_2(x) &= \int K_2(x,z)f(z)\,dz \\
&\cdots \\
\phi_n(x) &= \int K_n(x,z)f(z)\,dz
\end{aligned} \tag{14-7}$$

By the same means as before we see that $|K_n(x,z)| \le M^n(b-a)^{n-1}$; 
hence if (5) is fulfilled we can construct a uniformly convergent series 
called the resolvent (lösender Kern). 

$$K(x,z;\lambda) = \sum_{n=0}^{\infty} \lambda^n K_{n+1}(x,z) \tag{14-8}$$


From (3), (6) and (8), it follows that the solution of the integral equation 
is 


$$\phi(x) = f(x) + \lambda\int K(x,z;\lambda)f(z)\,dz \tag{14-9}$$


2 Continuous solutions of the equation may exist even if (5) is not true. There may 
also be discontinuous solutions. For these exceptions, see Lovitt, loc. cit., pp. 13 and 21. 

3 Henceforth, we usually omit limits of integration unless they are different from 
a and b. 

The resolvent and $\phi(x)$ have properties of a reciprocal nature as may 
be seen by comparing (2) and (9). If $\phi(x)$ is the unknown, (9) is the solution; 
if f(x) in (9) is the unknown, (2) is the solution. These properties 
are even more apparent if we rewrite (8) in the form 

$$K(x,z;\lambda) - K(x,z) = \sum_{n=1}^{\infty}\lambda^n K_{n+1}(x,z) = \sum_{n=1}^{\infty}\lambda^n\int K(x,y)K_n(y,z)\,dy$$

or 

$$K(x,z;\lambda) - K(x,z) = \lambda\int K(x,y)K(y,z;\lambda)\,dy \tag{14-10}$$

Similarly, we may obtain 

$$K(x,z;\lambda) - K(x,z) = \lambda\int K(x,y;\lambda)K(y,z)\,dy$$


b. Volterra’s Equation of the Second Kind. Application of the Liouville-Neumann 
series may also be made in this case. Suppose 

$$\phi(x) = f(x) + \lambda\int_0^x K(x,z)\phi(z)\,dz \tag{14-11}$$

is given. Then if 
Kea} - a a); 0< z $ z 


we may write an equation similar to (7) for $z \le x$ 

$$\phi_n(x) = \int_0^x K_n(x,z)f(z)\,dz$$

and also an equation like (6) 

$$K_n(x,z) = \int_z^x K(x,y)K_{n-1}(y,z)\,dy = \int_z^x K(x,y_1)\,dy_1\int_z^{y_1} K(y_1,y_2)K_{n-2}(y_2,z)\,dy_2$$

The solution of Volterra’s equation obtained in this way converges for all 
values of $\lambda$. 
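
The successive terms of the series for Volterra’s equation can also be generated numerically. The sketch below is an assumption-laden illustration, not a method taken from the text: the integral from 0 to x is replaced by the trapezoidal rule on a uniform grid, and the test case (kernel z − x, f(x) = x, λ = 1) is chosen because its solution is known to be sin x.

```python
import math


def volterra_neumann(kernel, f, lam, x_max, n_steps=200, n_terms=30):
    """Sum the Liouville-Neumann series for eq. (14-11) on a uniform grid."""
    h = x_max / n_steps
    grid = [i * h for i in range(n_steps + 1)]
    term = [f(x) for x in grid]               # zeroth term, phi_0 = f
    total = term[:]
    for _ in range(n_terms):
        new_term = [0.0] * len(grid)
        for i in range(1, len(grid)):
            # trapezoidal rule for lam * integral_0^{x_i} K(x_i, z) term(z) dz
            vals = [kernel(grid[i], grid[j]) * term[j] for j in range(i + 1)]
            new_term[i] = lam * h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
        term = new_term                        # already carries its power of lam
        total = [t + p for t, p in zip(total, term)]
    return grid, total


# Test case with known solution sin x.
grid, phi = volterra_neumann(lambda x, z: z - x, lambda x: x, 1.0, 2.0)
error = max(abs(p - math.sin(x)) for x, p in zip(grid, phi))
print(error)
```

No restriction on λ of the type (5) is needed here; the terms decrease factorially because each iteration integrates over the shrinking range 0 to x.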


c. Volterra’s Equation of the First Kind. Under certain conditions, 
Volterra’s equation of the first kind may also be solved by the Liouville-Neumann 
series. With a change of notation, we write this equation as 

$$g(x) = \lambda\int_0^x K(x,z)\phi(z)\,dz \tag{14-12}$$

Differentiation with respect to x results in 

$$g'(x) = \lambda\int_0^x \frac{\partial K}{\partial x}\phi(z)\,dz + \lambda K(x,x)\phi(x)$$

which is similar to (11) provided $K(x,x) \ne 0$, the quantities f(x) and $\lambda K(x,z)$ 
of (11) being replaced by 

$$\frac{g'(x)}{\lambda K(x,x)} \quad \text{and} \quad -\frac{1}{K(x,x)}\frac{\partial K}{\partial x}$$

respectively. A similar conversion of (12) to an equation of the second kind may be made 
by partial integration. 

When K(x,x) vanishes, the procedure just described gives an equation 
of the first kind again. Let us consider the situation in more detail, 
assuming that the kernel is a polynomial of n-th degree in x and that the 
coefficients of the terms in x are polynomials in z, but not necessarily of 
n-th degree. It is convenient to express the kernel as a polynomial in 
$\xi = (x - z)$, so that it may be written 

$$K(x,z) = a_0(z) + a_1(z)\xi + \cdots + a_n(z)\xi^n$$

Two special cases are of interest: (1) $a_0(z) = 0$; (2) $a_0(z)$ contains no 
constant term. 

In the first case, K(x,x) vanishes identically; but if the derivative of 
the kernel does not vanish, which means that $a_1 \ne 0$, two differentiations 
of eq. (12) will yield an equation of the second kind and a solution is again 
possible by this method. Further differentiations could be carried out if 
necessary. Several partial integrations could replace the differentiations, 
if this were preferred. 

In the second case, the kernel vanishes only for x = z = 0. However, 
the integral equation may then be converted into a differential equation. 
With the same polynomial kernel, differentiate eq. (12) (n + 1) times. 
The integral on the right will vanish, the differential equation remaining 
is of order n, and its solution, adjusted to fit the appropriate boundary 
conditions, is the solution of the integral equation. 

An explicit form for the solution can be given, but it is quite awkward in 
the general case. Note also that the presence or absence of a constant 
term in $a_0(z)$ is of no consequence. For illustrative purposes, let us take 
a simpler expression for the kernel. Suppose the polynomial is only of 
second degree and that $a_0(z) = A_0 + A_1z + A_2z^2$; $a_1(z) = B_0 + B_1z$; 
$a_2(z) = C_0$, where $A_i$, $B_i$, $C_i$ are constants. Three differentiations of (12) 
will give 

$$g'''(x) = A_2x^2\phi''(x) + (B_1 + 4A_2)x\phi'(x) + (2A_2 + B_1 + 2C_0)\phi(x)$$

which is a differential equation of the Euler type. (The equation takes exactly 
this form when $\lambda = 1$ and $A_0 = A_1 = B_0 = 0$; otherwise the lower-order 
coefficients contribute further terms.) Introduction of a new 
variable, u = ln x, will reduce it to a linear, inhomogeneous equation with 
constant coefficients, as discussed in sec. 2.8. Its form, with proper change 
in notation, is identical with eq. (2-19), and its solution is eq. (2-22). 

An important case arises when the kernel becomes infinite at one or more 
points within the range of x and z. It is then necessary to transform the 
equation to remove the singularity. As a typical example, consider the 
kernel 

$$K(x,z) = \frac{1}{(x-z)^{\alpha}}; \quad 0 < \alpha < 1$$

which is infinite when x = z. Substitute this kernel in (12), multiply both 
sides of the equation by $dx/(u-x)^{1-\alpha}$ and integrate with respect to x 
from 0 to u. If for simplicity we also take $\lambda = 1$ the result is 

$$\int_0^u \frac{g(x)\,dx}{(u-x)^{1-\alpha}} = \int_0^u \frac{dx}{(u-x)^{1-\alpha}}\int_0^x \frac{\phi(z)\,dz}{(x-z)^{\alpha}} = \int_0^u \phi(z)\,dz\int_z^u \frac{dx}{(u-x)^{1-\alpha}(x-z)^{\alpha}}$$

The justification of the change of limits and order of integration in the last 
equation is the following. Since x varies from 0 to u, and for every value of 
x, the variable z goes from 0 to x, the situation is equivalent to the variation 
of z from 0 to u and the variation of x from z to u for every value of z. 
The same result is also easily obtained from a figure. If we are integrating 
F(x,z) over the shaded area of Fig. 1 we see that 

$$\int_0^u dx\int_0^x F(x,z)\,dz = \int_0^u dz\int_z^u F(x,z)\,dx$$

The definite integral $\int_z^u F(x,z)\,dx = \int_z^u (u-x)^{\alpha-1}(x-z)^{-\alpha}\,dx$ may be 
evaluated as follows. Introduce the new variable $y = (u-x)/(u-z)$ 
which shifts the limits to 0 and 1, respectively. The result is an Eulerian 
integral of the first kind⁴ or B-function which is simply related to the 
$\Gamma$-function. Explicitly, the result using (3-12) is 

$$B(\alpha, 1-\alpha) = \Gamma(\alpha)\Gamma(1-\alpha) = \pi/\sin\alpha\pi.$$

The solution of the integral equation is thus 

$$\phi(u) = \frac{\sin\alpha\pi}{\pi}\,\frac{d}{du}\left[\int_0^u g(x)(u-x)^{\alpha-1}\,dx\right]$$
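
A short worked case may serve as a check on this inversion formula. The data below ($\phi = 1$, $\alpha = \tfrac12$, $\lambda = 1$) are our own choice for illustration and do not come from the text; the only outside fact used is the relation $B(p,q) = \Gamma(p)\Gamma(q)/\Gamma(p+q)$.

```latex
% Illustrative check (assumed data: phi = 1, alpha = 1/2, lambda = 1)
\begin{align*}
g(x) &= \int_0^x \frac{dz}{(x-z)^{1/2}} = 2x^{1/2} \\
\int_0^u g(x)(u-x)^{-1/2}\,dx &= 2\int_0^u x^{1/2}(u-x)^{-1/2}\,dx
      = 2u\,B(\tfrac32,\tfrac12) = \pi u \\
\phi(u) &= \frac{\sin(\pi/2)}{\pi}\,\frac{d}{du}\,(\pi u) = 1
\end{align*}
```

The formula thus returns the function $\phi = 1$ from which g(x) was constructed.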


Equations with singular kernel, especially those where the singularity 
results from an infinite limit of integration, may usually be solved by 
integral transforms. In fact, the transforms of Fourier, Laplace, Hankel, 
and Mellin are special cases of integral equations of the first kind. They 
have been discussed at length by Morse and Feshbach, loc. cit. 


4 See sec. 3.2. 


Problem. Solve the equation $\phi(x) = x + \int_0^x (z - x)\phi(z)\,dz$ by the Liouville-Neumann 
series. Hint: substitute (x − z) = u; (y − z) = v. Ans.: $\phi(x) = \sin x$. 


[Fig. 14-1] 


14.3. Fredholm’s Method of Solution.—a. The Inhomogeneous Equation. 
Fredholm studied the solution of a system of linear equations in 
n variables and observed that as n becomes infinite the results are applicable 
to linear integral equations. Although the reasoning is simple, the 
derivation of the final formulas requires considerable space. We therefore 
show only how the method may be used, referring the reader to other 
sources⁵ for the intermediate steps and proofs. 

The unique and continuous solution of (2) is of the form (9), where the 
resolvent is the ratio of two infinite series in $\lambda$. In fact 

$$K(x,z;\lambda) = \frac{D(x,z;\lambda)}{D(\lambda)} \tag{14-13}$$

where 

$$D(x,z;\lambda) = K(x,z) + \sum_{m=1}^{\infty}\frac{(-1)^m}{m!}D_m(x,z)\lambda^m \tag{14-14}$$

$$D(\lambda) = 1 + \sum_{m=1}^{\infty}\frac{(-1)^m}{m!}D_m\lambda^m \tag{14-15}$$

The coefficients $D_m$ and the functions $D_m(x,z)$ may be found from the 
following recurrence relations. Starting with $K(x,z) = D_0(x,z)$ we obtain 
$D_1$ from the integral 

$$D_m = \int D_{m-1}(x,x)\,dx \tag{14-16}$$

We then find $D_1(x,z)$ from 

$$D_m(x,z) = K(x,z)D_m - m\int K(x,y)D_{m-1}(y,z)\,dy \tag{14-17}$$

which enables us to determine $D_2$ from (16). Continuing in this way, all 
of the coefficients are calculated. In many cases, depending on the explicit 
form of the kernel, the series (14) and (15) contain only a finite number of 
terms. 

One distinct advantage of the Fredholm method is that (13) is uniformly 
convergent for all values of $\lambda$ unless $D(\lambda) = 0$. If that happens, the 
procedure which we have described is inapplicable since the resolvent 
becomes infinite. Actually, there is then no solution unless certain other conditions 
are met. We omit the necessary extension of the Fredholm theory 
but return to the problem in sec. 14.4b. 


5 See references at end of the chapter. 
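
The recurrence relations (16) and (17) may be seen at work on a very simple kernel. The case below, K(x,z) = xz on the range 0 to 1, is our own choice for illustration; it is one of the cases in which the series (14) and (15) terminate.

```latex
% Illustrative kernel K(x,z) = xz on 0 <= x, z <= 1 (not taken from the text)
\begin{align*}
D_0(x,z) &= xz \\
D_1 &= \int_0^1 D_0(x,x)\,dx = \int_0^1 x^2\,dx = \tfrac13 \\
D_1(x,z) &= K(x,z)D_1 - \int_0^1 K(x,y)D_0(y,z)\,dy
          = \tfrac13 xz - xz\int_0^1 y^2\,dy = 0 \\
D_m &= 0, \quad D_m(x,z) = 0 \qquad (m \ge 2) \\
D(\lambda) &= 1 - \tfrac13\lambda, \qquad D(x,z;\lambda) = xz \\
K(x,z;\lambda) &= \frac{D(x,z;\lambda)}{D(\lambda)} = \frac{3xz}{3-\lambda}
\end{align*}
```

The same resolvent follows from the series (8), since here $K_{n+1}(x,z) = xz/3^n$; the Fredholm form, however, remains valid for every $\lambda$ except the single root $\lambda = 3$ of $D(\lambda)$.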


b. The Homogeneous Equation. If f(x) = 0, so that the given equation 
is homogeneous, 

$$\phi(x) = \lambda\int K(x,z)\phi(z)\,dz \tag{14-18}$$

then cursory inspection of the solution (9) leads to the conclusion that 
$\phi(x) = 0$. This is generally the case but we shall see that when the parameter 
$\lambda$ assumes certain special values we are led to a situation similar 
to the eigenvalue and eigenfunction problem described in Chapter 8. If 
$D(\lambda) = 0$ and $D(x,z;\lambda) \ne 0$, eq. (13) indicates that $K(x,z;\lambda)$ approaches 
infinity and we may still find non-vanishing solutions of (18). Equating 
the right side of (15) to zero, we have a polynomial in $\lambda$ with n roots, multiple 
or distinct. They are the eigenvalues of the kernel, and the corresponding 
solutions of (18) are the eigenfunctions. Assuming that all eigenvalues 
are distinct, choose one of them, say $\lambda_1$, substitute (13) in (10) and multiply 
by $D(\lambda)$, which gives 

$$D(x,z;\lambda_1) = \lambda_1\int K(x,y)D(y,z;\lambda_1)\,dy \tag{14-19}$$

If we compare this equation with (18), we observe that $D(x,z;\lambda_1)$, for any 
constant z, is a solution of the homogeneous equation, i.e., 

$$\phi_1(x) = D(x,c;\lambda_1) \tag{14-20}$$

Having found a solution for $\lambda_1$, we proceed to find the others for the remaining 
eigenvalues in the same way. Linear combinations of them form 
the general solution 

$$\phi(x) = \sum_m C_m\phi_m(x) \tag{14-21}$$

where the $C_m$ are arbitrary constants. 

It is true that $D(x,z;\lambda_1)$ may vanish identically in x and z or vanish 
because of an unfortunate choice of the constant value of z. In the former 
case, non-trivial solutions may often be found by more complicated methods; 
in the latter case, we simply choose another $z \ne c$. When the eigenvalues 
are degenerate further modifications of the method are required. 


Problem a. Solve by the Fredholm method: 

$$\phi(x) = x + \lambda\int_0^1 (x+z)\phi(z)\,dz$$

Ans.: $\phi(x) = \dfrac{6x(\lambda-2) - 4\lambda}{\lambda^2 + 12\lambda - 12}$ 

Problem b. Show that $\int D(x,x;\lambda)\,dx = -\dfrac{dD(\lambda)}{d\lambda}$. Hint: use eq. (16). 

Problem c. Set f(x) = 0 in the equation of Problem a and solve. Hint: show 
that $D(x,z;\lambda) = (2/c)(2-c)(cx+1)(cz+1)$; $\lambda = 2c(2-c)$; $c = \pm\sqrt{3}$. 
Ans.: $\phi_\pm(x) = C_\pm(1 \pm \sqrt{3}\,x)$. 


14.4. The Schmidt-Hilbert Method of Solution.—In many physical 
problems, the kernel has the property of being symmetric, i.e., K(x,z) 
= K(z,x). In such cases,⁶ the integral equation may be solved by a method 
which is somewhat different from any of those in the preceding sections. 
We find it convenient to limit the discussion to kernels which are real as 
well as symmetric. 


a. The Homogeneous Equation. A real symmetric kernel has at least one 
eigenvalue and it may have an infinite number. We omit the proof of 
these facts. 

The eigenfunctions of the homogeneous equation (18) are mutually orthogonal. 
Suppose $\lambda_i$ and $\lambda_j$ are two different eigenvalues corresponding 
respectively to eigenfunctions $\phi_i$ and $\phi_j$. Then we may write 

$$\phi_i(x) = \lambda_i\int K(x,z)\phi_i(z)\,dz$$

$$\phi_j(x) = \lambda_j\int K(x,z)\phi_j(z)\,dz$$

6 Unsymmetric kernels may often be symmetrized; see sec. 14.7 or Courant-Hilbert, 
loc. cit. 

Multiply the first equation by $\phi_j$ and the second by $\phi_i$, then integrate over x: 

$$\int\phi_i(x)\phi_j(x)\,dx = \lambda_i\iint K(x,z)\phi_i(z)\phi_j(x)\,dz\,dx = \lambda_j\iint K(x,z)\phi_i(x)\phi_j(z)\,dz\,dx \tag{14-22}$$

The last integral may be written as $\lambda_j\iint K(z,x)\phi_i(z)\phi_j(x)\,dx\,dz$ by interchanging 
x and z. Thus if K(x,z) = K(z,x), the two integrals of (22) are 
identical and since $\lambda_i \ne \lambda_j$, it follows that 

$$\int\phi_i(x)\phi_j(x)\,dx = 0 \tag{14-23}$$

As we know from Chapter 8 such functions may always be normalized. 
Henceforth, we will assume that this has been done and will indicate the 
orthonormal solutions of (18) by $\Phi_i(x)$, so that 

$$\int\Phi_i(x)\Phi_j(x)\,dx = \delta_{ij} \tag{14-24}$$

The eigenvalues of a real, symmetric kernel are all real. Suppose the 
solution of the homogeneous equation (18) were of the form $\phi(x) = \phi_1(x) + i\phi_2(x)$ 
and one of its eigenvalues were also complex, $\lambda = \alpha + i\beta$. We 
could then take the complex conjugate of (18) 

$$\phi^*(x) = \lambda^*\int K(x,z)\phi^*(z)\,dz$$

But according to (23) 

$$(\lambda - \lambda^*)\int\phi\phi^*\,dx = 0$$

or 

$$2i\beta\int(\phi_1^2 + \phi_2^2)\,dx = 0$$

which means that $\beta = 0$ and the eigenvalues must all be real. 

Arbitrary functions of x, including the kernel for fixed z, may be expanded 
in terms of the eigenfunctions 

$$K(x,z) = \sum_i C_i\Phi_i(x) \tag{14-25}$$

The functions $\Phi_i(x)$ form a complete set as explained in Chapter 8. As 
also shown there, the coefficients of (25) may be found by integrating that 
equation term by term. Thus, using (24), 

$$C_i = \int K(x,z)\Phi_i(x)\,dx$$

But 

$$\Phi_i(z) = \lambda_i\int K(z,x)\Phi_i(x)\,dx = \lambda_i\int K(x,z)\Phi_i(x)\,dx$$

since the kernel is symmetric, hence 

$$C_i = \frac{\Phi_i(z)}{\lambda_i}$$

and (25) becomes 

$$K(x,z) = \sum_i\frac{\Phi_i(x)\Phi_i(z)}{\lambda_i} \tag{14-26}$$


b. Solution of the Inhomogeneous Equation. We are now ready to consider 
the inhomogeneous equation (2); for that purpose we assume that 
we have found the eigenfunctions of the homogeneous equation by the 
method of sec. 14.3b. Let them be $\Phi_i(x)$. Then we may write 

$$\phi(x) - f(x) = \sum_i a_i\Phi_i(x); \quad a_i = \int[\phi(x) - f(x)]\Phi_i(x)\,dx \tag{14-27}$$

where $\phi(x)$ and f(x) both come from (2). Now substitute (27) in (2) to 
give 

$$\sum_i a_i\Phi_i(x) = \lambda\int K(x,z)f(z)\,dz + \lambda\sum_i a_i\int K(x,z)\Phi_i(z)\,dz \tag{14-28}$$

We may also expand f(x): 

$$f(x) = \sum_i b_i\Phi_i(x); \quad b_i = \int f(x)\Phi_i(x)\,dx \tag{14-29}$$

and obtain by using (26) and (24) 

$$\int K(x,z)f(z)\,dz = \int\sum_i\frac{\Phi_i(x)\Phi_i(z)}{\lambda_i}\sum_j b_j\Phi_j(z)\,dz = \sum_i\frac{b_i\Phi_i(x)}{\lambda_i}$$

with a similar expression for the last integral of (28). That equation 
becomes 

$$\sum_i a_i\Phi_i(x) = \lambda\sum_i\frac{b_i}{\lambda_i}\Phi_i(x) + \lambda\sum_i\frac{a_i}{\lambda_i}\Phi_i(x) \tag{14-30}$$

Because of the independence of the functions $\Phi_i$, the coefficients of each 
may be equated on both sides of this equation. Hence, 

$$a_i = \lambda\left[\frac{b_i}{\lambda_i} + \frac{a_i}{\lambda_i}\right]$$

or if $\lambda \ne \lambda_i$ 

$$a_i = \frac{b_i\lambda}{\lambda_i - \lambda} \tag{14-31}$$

This method, which was devised by Schmidt and Hilbert, thus gives a solution 
for $\lambda \ne \lambda_i$ for we may substitute $b_i$ and (31) into (27) and obtain 

$$\phi(x) = f(x) + \lambda\sum_i\frac{b_i}{\lambda_i - \lambda}\Phi_i(x) = f(x) + \lambda\sum_i\left[\frac{\Phi_i(x)}{\lambda_i - \lambda}\int f(z)\Phi_i(z)\,dz\right] \tag{14-32}$$

As we have noted before, the homogeneous equation for $\lambda \ne \lambda_i$ has the 
solution $\phi(x) = 0$ since f(x) = 0. 
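
The expansion (32) is easily evaluated once the eigenfunctions are known. The sketch below rests on assumptions of our own: the kernel is x(1 − z) for x < z and z(1 − x) for x > z on the range 0 to 1, for which the orthonormal eigenfunctions are $\sqrt{2}\sin i\pi x$ with eigenvalues $i^2\pi^2$; the sum is cut off after a finite number of terms and the integrals are done by the midpoint rule. All function names are hypothetical.

```python
import math


def green_kernel(x, z):
    """Symmetric test kernel: x(1-z) for x < z, z(1-x) for x > z."""
    return x * (1.0 - z) if x <= z else z * (1.0 - x)


def eigenfunction(i, x):
    """Orthonormal eigenfunction of the test kernel (assumed known)."""
    return math.sqrt(2.0) * math.sin(i * math.pi * x)


def eigenvalue(i):
    return (i * math.pi) ** 2


def integrate(func, a=0.0, b=1.0, n=400):
    """Composite midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def schmidt_hilbert_solution(f, lam, n_modes=10):
    """Return phi(x) from eq. (14-32), truncated after n_modes terms."""
    # b_i = integral of f(z) Phi_i(z) dz, as in (14-29)
    coeffs = [integrate(lambda z, i=i: f(z) * eigenfunction(i, z))
              for i in range(1, n_modes + 1)]

    def phi(x):
        return f(x) + lam * sum(
            b * eigenfunction(i, x) / (eigenvalue(i) - lam)
            for i, b in enumerate(coeffs, start=1))

    return phi


# With f = sin(pi x) only the first term survives, giving
# phi = pi^2 sin(pi x) / (pi^2 - lambda).
phi = schmidt_hilbert_solution(lambda x: math.sin(math.pi * x), 1.0)
print(phi(0.5), math.pi ** 2 / (math.pi ** 2 - 1.0))
```

The denominators $\lambda_i - \lambda$ show at once why the method fails when $\lambda$ coincides with an eigenvalue, unless the corresponding $b_i$ vanishes.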

We must still consider the exceptional case when $\lambda$ is one of the eigenvalues 
of the kernel. Suppose, for example, that $\lambda = \lambda_0$ is an m-fold 
degenerate eigenvalue, i.e., $\lambda_0 = \lambda_1 = \lambda_2 = \cdots = \lambda_m$. Then (2) reads 

$$\phi(x) = f(x) + \lambda_0\int K(x,z)\phi(z)\,dz$$

and by the preceding method we obtain 

$$a_i = \frac{b_i\lambda_0}{\lambda_i - \lambda_0}$$

where i is not one of the numbers 1, 2, ..., m. When i equals one of these 
integers we have, if $a_i$ is to remain finite, 

$$b_1 = b_2 = \cdots = b_m = 0$$

which in turn requires that 

$$\int f(x)\Phi_j(x)\,dx = 0; \quad j = 1, 2, \cdots, m \tag{14-33}$$

Thus if $\lambda$ is an m-fold degenerate eigenvalue, the inhomogeneous equation 
has solutions only if f(x) is orthogonal to the corresponding eigenfunctions 
$\Phi_j(x)$. The general solution of the equation is then 

$$\phi(x) = f(x) + \lambda_0{\sum_i}'\left[\frac{\Phi_i(x)}{\lambda_i - \lambda_0}\int f(z)\Phi_i(z)\,dz\right] + C_1\Phi_1(x) + \cdots + C_m\Phi_m(x) \tag{14-34}$$

where the prime on the summation sign means that the terms i = 1, 2, ..., m 
are to be omitted from the sum. 


Problem. Find the solution of the equation of Problem a, sec. 14.3, by the Hilbert-Schmidt 
method for $\lambda$ not equal to an eigenvalue. Show that there are no solutions 
when $\lambda$ is an eigenvalue. 

14.5. Summary of Methods of Solution.—a. The Homogeneous Equation. 


1. $D(\lambda) \ne 0$. No solution except $\phi(x) = 0$. 

2. $D(\lambda) = 0$; $D(x,z;\lambda) \ne 0$. Solution is given by (20) and (21). 
The resulting eigenfunctions are orthogonal and may be normalized. To 
each solution belongs an eigenvalue. 

b. The Non-homogeneous Equation. 

$\lambda \ne \lambda_i$: 

1. Solution given by (9) provided (5) holds. 

2. For all values of $\lambda \ne \lambda_i$ solution is given by (9) and (13). 

3. If K(x,z) = K(z,x), solution is (32). 

$\lambda = \lambda_i$: 

4. K(x,z) = K(z,x); solution is (34). Special methods have been 
given for Volterra’s equations of the first and second kinds. 


USE OF INTEGRAL EQUATIONS 


14.6. Relation between Differential and Integral Equations.—We have 
shown in the previous sections how integral equations of the more common 
types may be solved. We now propose to study the relation between 
differential and integral equations so that we may state physical problems 
in either form at will. For this purpose consider as a simple example the 
second order differential equation 

$$y'' = f(x,y) \tag{14-35}$$

Integration results in 

$$\begin{aligned}
y'(x) &= \int_0^x f\{z,y(z)\}\,dz + C_1 \\
y(x) &= \int_0^x\left[\int_0^s f\{z,y(z)\}\,dz\right]ds + C_1x + C_2
\end{aligned} \tag{14-36}$$

An alternative form of the last expression⁷ is 

$$y(x) = \int_0^x (x-z)f\{z,y(z)\}\,dz + g(x); \quad g(x) = C_1x + C_2 \tag{14-37}$$

which is recognized as a non-linear Volterra equation of the second kind 
with y(x) as the unknown. 


7 To show that the two equations for y(x) are identical, differentiate the last one 
with respect to x; the result is (36). 

The boundary conditions which are needed to determine the two 
integration constants, $C_1$ and $C_2$, may be either of two types: (a) y and y′ 
are fixed at one point within the range of integration, say at x = 0; (b) y is 
fixed at two points. The first case is simple, for if y(0) = a, y′(0) = b, 
(37) becomes 

$$y(x) = \int_0^x (x-z)f\{z,y(z)\}\,dz + bx + a$$

The second case leads to greater difficulties. Suppose y(0) = a, y(1) = b; 
then $C_2 = a$, as before. For x = 1, we have 

$$b = y(1) = \int_0^1 (1-z)f\,dz + C_1 + a$$

or 

$$C_1 = b - a - \int_0^1 (1-z)f\,dz$$

where we abbreviate $f\{z,y(z)\}$ by the single symbol f. Substituting the 
values of $C_1$ and $C_2$ into (37) we obtain 

$$\begin{aligned}
y(x) &= h(x) + \int_0^x (x-z)f\,dz + x\int_0^1 (z-1)f\,dz \\
&= h(x) + \int_0^x (x-z)f\,dz + x\int_0^x (z-1)f\,dz + x\int_x^1 (z-1)f\,dz \\
&= h(x) + \int_0^x z(x-1)f\,dz + \int_x^1 x(z-1)f\,dz
\end{aligned} \tag{14-38}$$

where h(x) = a + (b − a)x. We thus see that in this case, if we are 
willing to divide the range of x into two parts with a different kernel for 
each part, 

$$K(x,z) = \begin{cases} z(x-1), & x > z \\ x(z-1), & x < z \end{cases}$$

eq. (38) becomes an integral equation of the Fredholm type 

$$y(x) = h(x) + \int_0^1 K(x,z)f\{z,y(z)\}\,dz$$
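
The two-part kernel of (38) can be tested on a case simple enough to integrate by hand. In the sketch below we assume, for illustration only, that f depends on x alone, so that the Fredholm form gives y(x) directly; the particular choice y″ = 6x, y(0) = 1, y(1) = 3 has the elementary solution y = x³ + x + 1. The function names are our own.

```python
def kernel(x, z):
    """Kernel of eq. (14-38): z(x-1) for z < x, x(z-1) for z > x."""
    return z * (x - 1.0) if z <= x else x * (z - 1.0)


def simpson(func, a, b, n=20):
    """Composite Simpson rule with an even number n of subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0


def solve_bvp(f, a_val, b_val, x):
    """y(x) = h(x) + int_0^1 K(x,z) f(z) dz for y'' = f(x), y(0)=a, y(1)=b."""
    h_x = a_val + (b_val - a_val) * x

    def integrand(z):
        return kernel(x, z) * f(z)

    # Integrate the two parts of the range separately, since the kernel
    # changes form at z = x.
    return h_x + simpson(integrand, 0.0, x) + simpson(integrand, x, 1.0)


# Test case: y'' = 6x, y(0) = 1, y(1) = 3, exact solution x^3 + x + 1.
for x in (0.0, 0.3, 0.7, 1.0):
    print(x, solve_bvp(lambda z: 6.0 * z, 1.0, 3.0, x), x ** 3 + x + 1.0)
```

When f depends on y as well, the same kernel appears under the integral sign but y must then be found by one of the methods of the preceding sections.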


Problem. Convert the following differential equation and its boundary conditions 
to an integral equation. 

$$y'' + y = 0; \quad y(0) = 0; \quad y'(0) = 1$$

Ans.: $y(x) = x + \int_0^x (z - x)y(z)\,dz$ 

14.7. Green’s Function.—Our problem now is to find a general method 
of constructing such kernels. For this purpose we consider the inhomogeneous 
Sturm-Liouville equation 

$$L(u) = (pu')' - qu = -\phi(x) \tag{14-39}$$

the homogeneous form of which has been discussed in Chapter 8. We 
will later prove that a certain function G(x,z) called Green’s function is the 
kernel of a homogeneous integral equation which is equivalent to (39) 
and its boundary conditions. At the moment we study the means of 
finding Green’s function. For reasons which will presently be clear, it is 
defined to have the following properties: 


a. For fixed z, it is a continuous function of x and satisfies all of the 
boundary conditions to be imposed on u. 


b. Both G′ and G″ are continuous at every point within the range 
of x except at x = z, where G′ is discontinuous⁸ so that 

$$G'(z+0) - G'(z-0) = -1/p(z) \tag{14-40}$$

c. Except at x = z, G(x,z) satisfies the differential equation L(G) = 0. 

We now proceed to find such a function G. Suppose two linearly independent 
solutions of 

$$L(u) = 0 \tag{14-41}$$

are known. If these are $u_1(x)$ and $u_2(x)$ their independence may be 
recognized by the fact that the Wronskian, $u_1u_2' - u_1'u_2 \ne 0$ (see sec. 
3.13), and the general solution of (41) is 

$$u(x) = C_1u_1 + C_2u_2$$

Let us divide the range of x into two portions; $a \le x < z$, $z < x \le b$, 
and write 

$$u = \begin{cases} u_{\rm I} = (A-\alpha)u_1(x) + (B-\beta)u_2(x); & x < z \\ u_{\rm II} = (A+\alpha)u_1(x) + (B+\beta)u_2(x); & x > z \end{cases} \tag{14-42}$$

where A, $\alpha$, B, $\beta$ are constants to be so chosen that u, which will later be 
taken as our Green function, satisfies conditions a, b and c. If we impose 
on this function the requirements a and b, we must have 

$$u_{\rm I}(z) = u_{\rm II}(z); \quad u_{\rm II}'(z) - u_{\rm I}'(z) = -1/p(z) \tag{14-43}$$

or, because of (42) 

$$\alpha u_1(z) + \beta u_2(z) = 0; \quad \alpha u_1'(z) + \beta u_2'(z) = -1/2p(z)$$

Solving these equations for $\alpha$ and $\beta$, we obtain 

$$\alpha = \frac{1}{2p}\,\frac{u_2}{u_1u_2' - u_1'u_2}; \quad \beta = \frac{1}{2p}\,\frac{u_1}{u_1'u_2 - u_1u_2'}$$

and hence 

$$u(x) = \pm f(x,z) + Au_1(x) + Bu_2(x) \tag{14-44}$$

where 

$$f(x,z) = \frac{1}{2p(z)}\,\frac{u_1(z)u_2(x) - u_2(z)u_1(x)}{u_1(z)u_2'(z) - u_1'(z)u_2(z)}$$

Here and in the remainder of this chapter, when two equations are given 
or when there is a choice of sign, the first always refers to x < z and the 
second to x > z. The two constants A and B of (44) are determined so 
that u(x) satisfies the boundary conditions of the problem. The resulting 
function, which we henceforth indicate by G(x,z), is Green’s function. 


8 The notation G′(z + 0) means that G′ is evaluated at the discontinuity when it 
is approached from values of x > z while G′(z − 0) is evaluated when the discontinuity 
is approached in the opposite direction. It is necessary to make this distinction in 
order that the magnitude of the discontinuity will be determined with respect to sign. 

We now prove that if $\phi(x)$ is a continuous function of x, then the function 
which will satisfy the differential equation (39) is given by 

$$u(x) = \int_a^b G(x,z)\phi(z)\,dz \tag{14-45}$$

Differentiation of (45) with respect to x gives 

$$u'(x) = \int_a^b G'(x,z)\phi(z)\,dz$$

$$\begin{aligned}
u''(x) &= \int_a^x G''(x,z)\phi(z)\,dz + \int_x^b G''(x,z)\phi(z)\,dz + G'(x,x-0)\phi(x) - G'(x,x+0)\phi(x) \\
&= \int_a^b G''(x,z)\phi(z)\,dz + [G'(x+0,x) - G'(x-0,x)]\phi(x) \\
&= \int_a^b G''(x,z)\phi(z)\,dz - \frac{\phi(x)}{p(x)}
\end{aligned}$$

Therefore 

$$pu'' + p'u' - qu = L(u) = \int_a^b (pG'' + p'G' - qG)\phi(z)\,dz - \phi(x)$$

Requirement c causes the first term on the right to vanish. Hence, we 
have established (39) and completed the proof that G(x,z), calculated as 
described, is the kernel of (45), and that the latter is equivalent to (39) 
and its boundary conditions. 

An important consequence of the properties of Green’s function is 
that it is symmetric. The proof proceeds as follows: Let us integrate 
the identity 

$$vL(u) - uL(v) = \frac{d}{dx}[p(vu' - uv')]$$

This results in a relation known as Green’s formula: 

$$I_a^b = \int_a^b [vL(u) - uL(v)]\,dx = [p(vu' - uv')]_a^b = S_a^b \tag{14-46}$$

Now let $G(x,z_1) = v$; $G(x,z_2) = u$; and consider the three ranges $a \le x < z_1$; 
$z_1 < x < z_2$; $z_2 < x \le b$. Evaluate the integral, dividing it into 
three parts $a$, $z_1 - \delta$; $z_1 + \delta$, $z_2 - \delta$; $z_2 + \delta$, $b$, where $\delta$ is a small increment 
which will approach zero in the limit. 

We thus may write 

$$I_a^b = S_a^{z_1-\delta} + S_{z_1+\delta}^{z_2-\delta} + S_{z_2+\delta}^{b} = S_a^b - S_{z_1-\delta}^{z_1+\delta} - S_{z_2-\delta}^{z_2+\delta} \tag{14-47}$$

According to (46), $I = S$ over each of these ranges, and both must be zero because, 
from c, $L(u) = L(v) = 0$. This in turn requires that $I_a^b = 0$ and $S_a^b = 0$ since otherwise 
Green’s function will not satisfy the boundary conditions. If in (47) 
we let $\delta \to 0$ and use (46) we obtain 

$$\begin{aligned}
0 = {} & -p(z_1)\{[v(z_1)u'(z_1) - v'(z_1+0)u(z_1)] - [v(z_1)u'(z_1) - v'(z_1-0)u(z_1)]\} \\
& -p(z_2)\{[v(z_2)u'(z_2+0) - v'(z_2)u(z_2)] - [v(z_2)u'(z_2-0) - v'(z_2)u(z_2)]\}
\end{aligned}$$

In writing these equations it must be remembered that u and v are continuous 
for the whole range while u′ is discontinuous only at $z_2$ and v′ only at $z_1$, 
so that for example $u'(z_1 + 0) = u'(z_1)$. Finally from (40) we obtain 

$$u(z_1) = v(z_2)$$

or 

$$G(z_1,z_2) = G(z_2,z_1)$$

Since the points $z_1$ and $z_2$ are arbitrary we write in general 

$$G(x,z) = G(z,x)$$

The symmetry of Green’s function is of considerable importance, since it 
permits application of the Hilbert-Schmidt theory. 
It frequently happens that the two constants A and B of (44) cannot 
be adjusted to satisfy the given boundary conditions. In this case, a 
modified Green function⁹ can be found in the following way. Suppose $u_0(x)$ 
is a solution of (41) that satisfies both boundary conditions. Then $cu_0(x)$ 
will also satisfy the conditions. No loss of generality occurs if we determine 
the constant so that $u_0(x)$ is normalized, 

$$\int u_0^2(x)\,dx = 1$$

and we shall suppose that this is done. We now set 

$$L(u) = u_0(x)u_0(z)$$

and determine a function G(x,z) that has the same properties as we required 
of the simple Green function, except that it satisfies the equation 
$L(G) = u_0(x)u_0(z)$ instead of L(G) = 0. We finally require that 

$$\int G(x,z)u_0(x)\,dx = 0 \tag{14-48}$$

The resulting modified Green function, which is symmetric, satisfies the 
inhomogeneous differential equation (39) including its boundary conditions. 
The proof of these facts is similar to that used in the case of the simple 
Green function. 

Problem. Find Green’s function for L(u) = u″ with u(0) = u(1) = 0. Hint: 
let $u_1(x) = x$; $u_2(x) = 1$. 
Ans.: See Table 1, sec. 14.9. 


Example. Suppose L(u) = u″ = 0; u(1) = u(−1); u′(1) = u′(−1).
If we substitute the two linearly independent solutions of the preceding
problem in (44) we see that dG(x,z)/dx = ±½ + A, hence the second
boundary condition cannot be satisfied. A solution of the differential
equation which does satisfy the boundary conditions is u₀ = constant or,
when normalized, u₀(x) = 1/√2. Hence we seek a solution of the equation
L(u) = u″ = u₀(x)u₀(z) = ½. This is u = x²/4. Using (44) and
the results of the last problem we see that

$$G(x,z) = \pm\frac{x - z}{2} + Ax + B + \frac{x^2}{4}$$

(the upper sign holding for x < z), which gives A = −z/2 when the further
condition G(1,z) = G(−1,z) is imposed. Omitting the constant factor
u₀(x) = 1/√2 we now determine B so that (48) is satisfied. This requires that

$$\int_{-1}^{z}\left[\frac{x - z}{2} - \frac{xz}{2} + \frac{x^2}{4} + B\right]dx + \int_{z}^{1}\left[-\frac{x - z}{2} - \frac{xz}{2} + \frac{x^2}{4} + B\right]dx = 0$$

The result is B = 1/6 + z²/4, so that finally

$$G(x,z) = \pm\frac{x - z}{2} + \frac{(x - z)^2}{4} + \frac{1}{6}$$

This will satisfy all of the boundary conditions.

9 A different procedure is possible in some cases; see Lovitt, loc. cit.
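The result of this example can be checked numerically. The following is only a sketch under stated assumptions: it takes the modified Green function in the single form G(x,z) = −½|x − z| + ¼(x − z)² + 1/6 on (−1, 1), and the helper names `integrate` and `dG_dx` are hypothetical. It tests the two periodic end conditions and condition (48) with the constant u₀.

```python
def G(x, z):
    """Modified Green function for u'' on (-1, 1) with periodic end conditions."""
    return -0.5 * abs(x - z) + 0.25 * (x - z) ** 2 + 1.0 / 6.0

def integrate(f, a, b, n=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def dG_dx(x, z, eps=1e-6):
    """Central-difference estimate of dG/dx (used away from x = z)."""
    return (G(x + eps, z) - G(x - eps, z)) / (2 * eps)

z = 0.3                                               # arbitrary source point
periodic_gap = G(1.0, z) - G(-1.0, z)                 # tests u(1) = u(-1)
slope_gap = dG_dx(1.0, z) - dG_dx(-1.0, z)            # tests u'(1) = u'(-1)
orthogonality = integrate(lambda x: G(x, z), -1.0, 1.0)   # condition (48), u0 constant
print(periodic_gap, slope_gap, orthogonality)
```

All three printed numbers should vanish to within rounding and quadrature error, whatever value of z between −1 and 1 is chosen.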

14.8. The Inhomogeneous Sturm-Liouville Equation.—Having proved
that we can convert (39) to an integral equation, we wish to give explicit
forms of the latter for different φ(x). Suppose

$$\varphi(x) = \lambda w u - \chi(x)$$
so that (39) becomes
$$L(u) + \lambda w u = \chi(x) \tag{14-49}$$

The resulting integral equation is

$$u(x) = \lambda\int G(x,z)\,w(z)\,u(z)\,dz + g(x), \qquad g(x) = -\int G(x,z)\,\chi(z)\,dz \tag{14-49a}$$

which is equivalent to (49) and its boundary conditions. Finally if
χ(x) = 0, the homogeneous differential equation

$$L(u) + \lambda w u = 0 \tag{14-49b}$$
and its boundary conditions become equivalent to
$$u(x) = \lambda\int G(x,z)\,w(z)\,u(z)\,dz \tag{14-50}$$

but the kernel in this case is not symmetric unless w(x) = 1. If that is
true, (50) is a homogeneous integral equation and can be solved by the
methods of sec. 14.3b. If w(x) ≠ 1, we may introduce a new unknown
function

$$y(x) = u(x)\sqrt{w(x)}$$
multiply the integral equation by √w(x) and obtain

$$y(x) = \lambda\int H(x,z)\,y(z)\,dz$$

where we now have a symmetric kernel H(x,z) = G(x,z)√(w(x)w(z)).
Eq. (49b) forms the basis of the Sturm-Liouville theory which was
discussed in sec. 8.5.
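The effect of the substitution y = u√w on the kernel can be illustrated with a few lines of code. This is a sketch only: the weight w(x) = 1 + x is a hypothetical choice made for illustration, and G is taken to be the Green function of u″ with u(0) = u(1) = 0.

```python
import math

def G(x, z):
    """Green's function for u'' with u(0) = u(1) = 0."""
    lo, hi = min(x, z), max(x, z)
    return lo * (1.0 - hi)

def w(x):
    """Hypothetical positive weight function, chosen only for illustration."""
    return 1.0 + x

def K(x, z):
    """Kernel G(x,z) w(z): not symmetric when w is not constant."""
    return G(x, z) * w(z)

def H(x, z):
    """Symmetrized kernel acting on y(x) = u(x) * sqrt(w(x))."""
    return G(x, z) * math.sqrt(w(x) * w(z))

x, z = 0.2, 0.7
k_asymmetry = K(x, z) - K(z, x)
h_asymmetry = H(x, z) - H(z, x)
print(k_asymmetry, h_asymmetry)
```

The first number is different from zero while the second vanishes, which is the property needed before the theory of symmetric kernels can be applied.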

Let us consider (41) and (49b) further. We write

$$L(v) + \lambda v = 0; \qquad L(u) = 0 \tag{14-51}$$
and suppose that their Green functions are known so that
$$u = G(x,\eta); \qquad v = \Gamma(x,\xi) \tag{14-52}$$

Substitute these relations in Green’s formula (46), use (40) and arguments
similar to those which proved that Green’s function is symmetric. The
result is

$$\Gamma(\eta,\xi) = G(\xi,\eta) + \lambda\int G(x,\eta)\,\Gamma(x,\xi)\,dx$$

For fixed ξ, this is recognized as identical with (2) where Γ is the unknown,
G(ξ,η) = f(η) and G(x,η) is the kernel. If we now change x, η, ξ to x₁, x, z
and remember that the kernel is symmetric we obtain

$$\Gamma(x,z;\lambda) - G(x,z) = \lambda\int G(x,x_1)\,\Gamma(x_1,z;\lambda)\,dx_1$$

which shows by comparison with (10) that Γ(x,z;λ) is the resolvent of the
kernel G(x,x₁). We may thus use equations of the form of (2) or (10) to
find the solution of either form of (51) when the appropriate Green
function (52) is known. Finally, referring to (17) and the result of Problem b,
sec. 14.3, we see that

$$\frac{D'(\lambda)}{D(\lambda)} = \frac{d\ln D(\lambda)}{d\lambda} = -\int \Gamma(x,x;\lambda)\,dx$$

which will give D(λ) by integration over λ and hence the eigenvalues from
the relation D(λ) = 0.

Problem. Find Green’s function for L(u) = u″ + k²u with the boundary
conditions of the previous problem. Hint: take u₁(x) = cos kx; u₂(x) = sin kx.
Ans.: See Table 1.


14.9. Some Examples of Green’s Function.—For convenience of
reference, we list in Table 1 Green’s function for some important
differential equations. The following boundary conditions include those most
often encountered:

a. u(0) = u(1) = 0
b. u(−1) = u(1); u′(−1) = u′(1)
c. u(0) = u′(1) = 0
d. u(−1) = u(1) = 0
e. u(0) = −u(1); u′(0) = −u′(1)
f. u(0) = u(1) = u′(0) = u′(1) = 0
g. u(x) finite; −∞ < x < ∞

When the limits are a and b, the appropriate Green function G(X,Z) may be
found from our results by the transformations

$$x = \frac{X - a}{b - a}, \qquad z = \frac{Z - a}{b - a}$$

for if G(x,z) is bounded by (0,1) then G(X,Z) is bounded by (a,b). The
method of calculating Green’s function in each case is identical with that
described in the preceding sections. When only one equation is given for
G(x,z) it refers to x < z; for x > z, interchange x and z.
In addition to the results found in Table 1, Green’s function for several 
other differential equations will be given (see also Table 1 in sec. 8.5). 
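For the Legendre differential equation

$$L(u) = [(1 - x^2)u']'; \qquad -1 \le x \le 1 \tag{14-53}$$

The boundary conditions are that the solutions remain finite at x = ±1.
Green’s function is

$$G(x,z) = -\tfrac{1}{2}\ln[(1 - x)(1 + z)] + \ln 2 - \tfrac{1}{2} \tag{14-54}$$
The associated Legendre differential equation is

$$L(u) = [(1 - x^2)u']' - \frac{m^2}{1 - x^2}\,u = 0$$
and
$$G(x,z) = \frac{1}{2m}\left[\frac{(1 + x)(1 - z)}{(1 - x)(1 + z)}\right]^{m/2}; \qquad m \ne 0 \tag{14-55}$$

For m = 0, the proper Green function is (54).
The zero-th order Bessel equation is L(u) = (xu′)′ = 0. With the
boundary conditions u(1) = 0; u(0) finite

$$G(x,z) = -\ln z \tag{14-56}$$
The n-th order equation is

$$L(u) = (xu')' - \frac{n^2}{x}\,u = 0$$
and

$$G(x,z) = \frac{1}{2n}\left[\left(\frac{x}{z}\right)^{n} - (xz)^{n}\right]$$
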
with the same boundary conditions as for the zero-th order equation.

TABLE 1

      L(u)          Boundary     G(x,z)
                    Condition
  1.  u″            a            (1 − z)x
  2.  u″            b            ¼(x − z)² + 1/6 − ½|x − z|
  3.  u″            c            x
  4.  u″            d            −½[|x − z| + xz − 1]
  5.  u″            e            −½|x − z| + ¼
  6.  u″            g            none exists
  7.  u″ + k²u      a            sin kx sin k(1 − z) / (k sin k)
  8.  u″ + k²u      b            −cos k(x − z + 1) / (2k sin k)
  9.  u″ − k²u      a            sinh kx sinh k(1 − z) / (k sinh k)
 10.  u″ − k²u      b            cosh k(x − z + 1) / (2k sinh k)
 11.  u″ − k²u      [illegible]  cosh kx cosh k(1 − z) / (k sinh k)
 12.  u″ − u        g            ½e^(−|x − z|)
 13.  u⁗            f            x²(z − 1)²(2xz + x − 3z) / 6

APPLICATION TO PHYSICAL PROBLEMS


14.10. Abel’s Integral Equation.—One of the earliest applications of
integral equations to a physical problem was made by Abel (1823).
Consider a particle which falls along a smooth curve in a vertical plane. Let
its original position above a given horizontal plane be z₀, its position at
time t be z and at the end of its fall be z = 0. Let ds be the distance
travelled in time dt. Then if the particle moves under no force but mg, the
force of gravity, its velocity

$$v = \frac{ds}{dt} = \sqrt{2g(z_0 - z)} \tag{14-57}$$

The whole time of descent is
$$T = \int_0^T dt = \int \frac{ds}{\sqrt{2g(z_0 - z)}} = -\frac{1}{\sqrt{2g}}\int_0^{z_0}\frac{s'(z)\,dz}{\sqrt{z_0 - z}}$$
If the shape of the curve is given in terms of z,
$$s = s(z)$$
then the time of descent may be calculated. The reverse problem studied
by Abel is to find a curve for which the time T is a given function of z₀,
T(z₀) = f(z₀) (compare the brachistochrone problem, sec. 6.1b). We
thus wish to find

$$\varphi(z) = -s'(z) > 0$$
or
$$f(z_0) = \frac{1}{\sqrt{2g}}\int_0^{z_0}\frac{\varphi(z)\,dz}{\sqrt{z_0 - z}} \tag{14-58}$$

which is a Volterra integral equation of the first kind. The presence of the
singularity at z = z₀ makes it necessary to solve the equation in the manner
of sec. 14.2c. The details may be left to the reader.
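The direct problem (curve given, time sought) is easy to test on a straight incline. The sketch below assumes that the time of descent has the form T(z₀) = (1/√(2g)) ∫₀^{z₀} φ(z) dz/√(z₀ − z) with φ = −ds/dz, so that φ = 1/sin α for a track inclined at the angle α. The function name `descent_time`, the value of g and the numbers chosen are illustrative assumptions.

```python
import math

def descent_time(phi, z0, g=9.81, n=2000):
    """T(z0) = (1/sqrt(2g)) * integral_0^z0 phi(z) / sqrt(z0 - z) dz.

    The substitution z = z0 - t**2 removes the singularity at z = z0:
    the integral becomes integral_0^sqrt(z0) 2 * phi(z0 - t**2) dt,
    which is evaluated here with the midpoint rule.
    """
    t_max = math.sqrt(z0)
    h = t_max / n
    total = sum(2.0 * phi(z0 - ((i + 0.5) * h) ** 2) for i in range(n))
    return h * total / math.sqrt(2.0 * g)

alpha = math.radians(30.0)                  # inclination of the track (assumed)
phi = lambda z: 1.0 / math.sin(alpha)       # -ds/dz is constant on an incline
z0 = 2.0                                    # starting height in metres (assumed)

T_numeric = descent_time(phi, z0)
T_exact = math.sqrt(2.0 * z0 / 9.81) / math.sin(alpha)   # elementary mechanics
print(T_numeric, T_exact)
```

The elementary result follows from uniform acceleration g sin α over the length z₀/sin α, and the two printed numbers should agree.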

14.11. Vibration Problems.—a. The homogeneous string treated in 
Chapter 7 was reduced to the eigenvalue problem (cf. eq. 7-33), 


$$S''(x) + k^2 S(x) = 0$$


If we make the proper change of variable so that the boundary conditions 
are S(0) = S(1) = 0 we see that the differential equation is similar to 
(49b), the boundary conditions lead to Green’s function (1) from Table 1 
and the resulting homogeneous integral equation is of the form of eq. (50) 
when λ = k² and w = 1.
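The lowest eigenvalue of this integral equation can be estimated numerically. The sketch below is an illustration under assumptions: it takes the kernel (1 − z)x for x < z (with x and z interchanged for x > z), replaces the integral by a sum over N − 1 equally spaced interior points, and uses power iteration, a technique not discussed in the text, to find the largest eigenvalue μ of the discretized kernel, so that λ = 1/μ. For the string with fixed ends the lowest eigenvalue is expected to be k² = π².

```python
import math

N = 100                                    # number of subintervals (assumed)
h = 1.0 / N
nodes = [i * h for i in range(1, N)]       # interior points of (0, 1)

def G(x, z):
    """Green's function of u'' with u(0) = u(1) = 0."""
    return min(x, z) * (1.0 - max(x, z))

# Discretized integral operator: (K u)_i = h * sum_j G(x_i, x_j) u_j
K = [[h * G(x, z) for z in nodes] for x in nodes]

def apply(K, u):
    """Matrix-vector product K u."""
    return [sum(kij * uj for kij, uj in zip(row, u)) for row in K]

u = [1.0] * len(nodes)
mu = 0.0
for _ in range(60):                        # power iteration for the largest mu
    v = apply(K, u)
    mu = max(abs(c) for c in v)            # u is kept with largest component 1
    u = [c / mu for c in v]

lambda_1 = 1.0 / mu                        # u = lambda * K u  implies  lambda = 1/mu
print(lambda_1, math.pi ** 2)
```

The iterate u approaches the shape sin πx, and λ₁ differs from π² only by the discretization error, which decreases as N is increased.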


b. Forced Vibrations. Suppose the string is subjected to a periodic
force f(x) cos (βt + δ). Then if we set v = 1 in eq. (1) of Chapter 8 we
have
$$\frac{\partial^2 U}{\partial t^2} = U'' + f(x)\cos(\beta t + \delta) \tag{14-59}$$
with boundary conditions U(0,t) = U(1,t) = 0. We seek a solution of
the form

$$U = S(x)\cos(\beta t + \delta)$$
which reduces (59) to
$$S''(x) + \beta^2 S(x) = -f(x) \tag{14-60}$$

if we remember that S(0) = S(1) = 0. This differential equation is like
(49) and the integral equation like (49a) with kernel identical with that
of the homogeneous string. When β² is an eigenvalue, the integral equation
can be solved only if f(x) is orthogonal to the associated eigenfunction of
the homogeneous equation. We know from Chapter 8 that the eigenfunctions
are sin nπx, hence the required condition is

$$\int_0^1 f(x)\sin n\pi x\,dx = 0$$
If β² is not an eigenvalue, solutions are still possible. Following the
procedure of sec. 14.8, we look for Green’s function of eq. (60) which is given
as item (7) in Table 1. This is the resolvent of our integral equation,
hence from eq. (10) the unique solution of (60) is

$$S(x) = g(x) + \beta^2\int_0^1 \Gamma(x,z)\,g(z)\,dz, \qquad g(x) = \int_0^1 G(x,z)\,f(z)\,dz$$
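The resolvent formula for the forced string can be tried on a simple case. In the sketch below, which is an illustration under stated assumptions rather than part of the original treatment, the solution is taken as S(x) = g(x) + β² ∫₀¹ Γ(x,z) g(z) dz with g(x) = ∫₀¹ G(x,z) f(z) dz, where G is the kernel of the homogeneous string and Γ is sin βx sin β(1 − z)/(β sin β) for x < z. The force f(x) = 1 and the value β = 2 are arbitrary choices; the integrals are evaluated with the midpoint rule.

```python
import math

beta = 2.0                      # driving frequency; beta**2 is not an eigenvalue
n = 400                         # midpoint-rule panels (assumed)
h = 1.0 / n
zs = [(i + 0.5) * h for i in range(n)]

def G(x, z):
    """Kernel of the homogeneous string: Green's function of u''."""
    return min(x, z) * (1.0 - max(x, z))

def Gamma(x, z):
    """Green's function of u'' + beta^2 u, used here as the resolvent kernel."""
    lo, hi = min(x, z), max(x, z)
    return math.sin(beta * lo) * math.sin(beta * (1.0 - hi)) / (beta * math.sin(beta))

f = lambda x: 1.0               # uniform force amplitude (illustrative choice)

def g(x):
    """g(x) = integral_0^1 G(x, z) f(z) dz."""
    return h * sum(G(x, z) * f(z) for z in zs)

g_on_grid = [g(z) for z in zs]

def S(x):
    """S(x) = g(x) + beta^2 * integral_0^1 Gamma(x, z) g(z) dz."""
    return g(x) + beta ** 2 * h * sum(Gamma(x, z) * gz for z, gz in zip(zs, g_on_grid))

def S_exact(x):
    """Closed-form solution of S'' + beta^2 S = -1 with S(0) = S(1) = 0."""
    c = (1.0 - math.cos(beta)) / math.sin(beta)
    return (math.cos(beta * x) - 1.0 + c * math.sin(beta * x)) / beta ** 2

for x in (0.25, 0.5, 0.75):
    print(x, S(x), S_exact(x))
```

The two columns should agree to within the quadrature error. As β approaches π the factor sin β in the denominator of Γ becomes small, which is the resonance excluded above.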


c. The Suspended Rope. Let a rope of unit length hang in its
equilibrium position from the point x = 1. If it executes small vibrations in
a vertical plane, its equation of motion is

$$\frac{\partial^2 U}{\partial t^2} = \frac{\partial}{\partial x}\left(x\,\frac{\partial U}{\partial x}\right)$$

with U as its displacement. The horizontal component of its tension at x
is x(∂U/∂x), so the boundary conditions are U(1) = 0, U(0) finite.
Writing U = u(x)φ(t) we obtain

$$[x\,u'(x)]' + k^2 u(x) = 0$$
$$\varphi''(t) + k^2\varphi(t) = 0$$

The proper Green function for the homogeneous differential equation in x
is eq. (56).

CHAPTER 15


GROUP THEORY 
PROPERTIES OF A GROUP 


Group theory has become so vital a part of modern physical and 
chemical analysis that the inclusion of its basic structure seemed inevitable 
to the authors of this book. Because of the great volume of available 
material arbitrary selection had to be made, and many proofs had to be 
omitted or given only in outline. Care has been taken, however, to insure 
that the attentive reader of the present chapter will be able to familiarize 
himself with all the tools needed for handling the simpler problems of 
group theory, such as those arising in quantum mechanics and in the field of 
molecular structure. A certain amount of material, easily obtained by the 
methods discussed in this chapter, but of somewhat lengthy derivation, has 
been collected at the end in Table 7. 

15.1. Definitions.—A group¹ is a set of abstract elements A, B, C, ···,
finite or infinite in number, with a law of combination for any two elements
A and B to form a product² AB such that:


a. Every product of two elements and the square of every element
is a member of the set.


b. The set contains a unit element E for which EA = AE = A for
every member of the set.


c. The associative law holds: A(BC) = (AB)C.


d. Every element has an inverse, X = A⁻¹, so that AX = AA⁻¹ =
A⁻¹A = E.

The set of all integers, positive, negative and zero, forms a group if the
law of combination is addition. The unit element is zero and the negative
of every element is its inverse. These numbers do not form a group if the
law of combination is multiplication. In this case, E = 1, but the element
zero has no inverse hence (d) cannot be satisfied. For any law of
combination, we always speak of a product and write the two elements as if they
were multiplied together.


1 For general treatises on group theory, see references at end of this chapter.
2 Following the convention of sec. 10.10, it is to be understood throughout this
chapter that the elements of a product are to be taken in the order from right to left.




A finite group of order g contains a finite number of elements, g. A
simple example of such a group (of order four) is furnished by the numbers
±1, ±i. If n is the smallest integer for which Xⁿ = E, n is called the
order of the element X. The n elements X, X², X³, ···, Xⁿ⁻¹, Xⁿ = E
form the period of X, indicated by {X}. The period of a single element is
thus a finite group; it is called a cyclic group.
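The notions of order and period can be tried out on the four numbers 1, i, −1, −i with ordinary multiplication as the law of combination. The sketch below is illustrative only; the function names `order` and `period` are hypothetical.

```python
elements = [1, 1j, -1, -1j]        # the numbers 1, i, -1, -i under multiplication

def order(x, unit=1):
    """Smallest n >= 1 for which x**n equals the unit element."""
    n, p = 1, x
    while p != unit:
        p *= x
        n += 1
    return n

def period(x, unit=1):
    """The elements x, x^2, ..., x^n = E generated by x."""
    out, p = [x], x
    while p != unit:
        p *= x
        out.append(p)
    return out

closed = all(a * b in elements for a in elements for b in elements)
has_inverses = all(any(a * b == 1 for b in elements) for a in elements)
orders = {x: order(x) for x in elements}
print(closed, has_inverses, orders)
print(period(1j))                  # four elements: i generates the whole group
```

Since the period of i already contains all four elements, this group is cyclic; the element −1 has order two and generates only −1 and 1.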

All of the groups so far mentioned have the property that AB = BA
for every pair of elements. When this condition is fulfilled, the group is said
to be Abelian. Two or more cyclic groups (they are also Abelian) may be
combined to form a single group which is non-Abelian. Suppose

$$A^3 = E; \qquad C^2 = E; \qquad CA = A^2C \tag{15-1}$$
then the group, which we designate by D₃ (for reasons which appear later)
is of order six with elements E, A, A², C, AC, A²C. The products of these
elements may be arranged in a multiplication table; CA, for example, is
found at the intersection of row C and column A. If we let A² = B,
AC = D, A²C = F and use (1) we obtain for the group D₃

        | E  A  B  C  D  F
     ---+------------------
      E | E  A  B  C  D  F
      A | A  B  E  D  F  C
      B | B  E  A  F  C  D          (15-2)
      C | C  F  D  E  B  A
      D | D  C  F  A  E  B
      F | F  D  C  B  A  E

It should be noticed that each element occurs once and only once in each
row or column.


Problem a. Use (15-1) to derive the multiplication table of (15-2). 

Problem b. Show that if any element occurs more than once in a row or column 
of a multiplication table for a group then the group postulates (a)-(d) could not be 
fulfilled. 
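One way of carrying out the derivation asked for in Problem a can be sketched in code. The assumptions are these: the defining relations are taken as A³ = E, C² = E, CA = A²C, every element is written as AⁱCʲ and stored as the pair (i, j), and the relation CA = A²C is used in the form CAᵏ = A²ᵏC to bring each product back to that shape. The names `mul`, `names` and `table` are hypothetical.

```python
from itertools import product

# Each element is written A^i C^j with i in {0, 1, 2} and j in {0, 1}.
names = {(0, 0): "E", (1, 0): "A", (2, 0): "B",
         (0, 1): "C", (1, 1): "D", (2, 1): "F"}

def mul(x, y):
    """Product xy, using A^3 = E, C^2 = E and CA = A^2 C to move C past A."""
    (i, j), (k, l) = x, y
    shift = 2 * k if j == 1 else k     # C A^k = A^(2k) C
    return ((i + shift) % 3, (j + l) % 2)

order = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]   # E A B C D F
table = {(names[x], names[y]): names[mul(x, y)]
         for x, y in product(order, repeat=2)}

print("    " + " ".join(names[y] for y in order))
for x in order:
    print(names[x] + " | " + " ".join(table[names[x], names[y]] for y in order))
```

The entry in row C and column A comes out as F, while row A and column C gives D, so the group is not Abelian. Each printed row and column contains every element exactly once, in line with Problem b.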


15.2. Subgroups.—A group whose elements are contained in another
group is called a subgroup. Thus we may always find subgroups in any
group by forming the period of each of its elements. For example, in D₃
a subgroup of order three is obtained from {A} = {B} = E, A, B.
Similarly, three different subgroups, each of order two, may be found: {C} = E,
C; {D} = E, D; {F} = E, F. In addition to these subgroups, the single
element E is a subgroup of order one while the group itself is a subgroup of
order six. In this case, each subgroup, except the group itself, is cyclic. It
does not follow, however, that all subgroups are cyclic.

Suppose a given group is of order g and a subgroup of it is of order h
with elements $A_1, A_2, \cdots, A_h$. Now take B, an element of the group which
is not contained in the subgroup, and form the products $BA_1, BA_2, \cdots,
BA_h$. These must all be in the group but none can be in the subgroup, for
if $BA_i = A_j$ were one of the members of the subgroup then $B = A_jA_i^{-1}$
would also be in the subgroup which is contrary to our assumption
concerning the selection of B. We have now found 2h members of the group. If
2h < g, it will be possible to find a new element C contained neither among
the elements $A_1, \cdots, A_h$ nor among the elements $BA_1, \cdots, BA_h$.
Repeating the operations of multiplication and using the same arguments as before
we obtain h new elements $CA_1, \cdots, CA_h$. Since the group is of finite order,
the procedure must end when we have found kh = g elements (k an
integer). It thus follows that the order of the subgroup must be a divisor of
the order of the whole group. In the example of the preceding paragraph,
we see that we have found all possible subgroups since the only divisors of 6
(the order of the group) are 1, 2, 3, and 6.
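The periods of the six elements, and the divisor property just proved, can be listed mechanically. The sketch assumes the relations A³ = E, C² = E, CA = A²C for D₃ and stores the element AⁱCʲ as the pair (i, j); the function names are hypothetical.

```python
# Elements of D3 written as A^i C^j, with the labels B = A^2, D = AC, F = A^2 C.
names = {(0, 0): "E", (1, 0): "A", (2, 0): "B",
         (0, 1): "C", (1, 1): "D", (2, 1): "F"}
UNIT = (0, 0)

def mul(x, y):
    """Product xy from A^3 = E, C^2 = E, CA = A^2 C (so C A^k = A^(2k) C)."""
    (i, j), (k, l) = x, y
    return ((i + (2 * k if j else k)) % 3, (j + l) % 2)

def period(x):
    """The period {X}: X, X^2, ..., X^n = E."""
    out, p = [x], x
    while p != UNIT:
        p = mul(p, x)
        out.append(p)
    return out

g = len(names)
subgroups = {frozenset(names[e] for e in period(x)) for x in names}
for s in sorted(subgroups, key=lambda s: (len(s), sorted(s))):
    print(sorted(s), "order", len(s), "divides g:", g % len(s) == 0)

# One step of the counting argument: multiply the subgroup E, A, B by C.
second_batch = {names[mul((0, 1), a)] for a in period((1, 0))}
print(sorted(second_batch))
```

Five cyclic subgroups are found, of orders 1, 2, 2, 2 and 3. Multiplying the subgroup E, A, B by the outside element C produces three elements none of which lies in the subgroup, and the two batches together exhaust the group.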

15.3. Classes.—Let A, B and X be any three elements of a group;
then if B = X⁻¹AX, B is said to be the transform of A by the element X:
A and B are conjugate to each other. The following properties of
conjugate elements may be proved; it is easy to verify them for D₃ by the use
of the group table (2).


a. Every element is conjugate with itself.
b. If A is conjugate with B, then B is conjugate with A.


c. If A is conjugate with both B and C, then B and C are conjugate
with each other.


The complete set of elements $\mathcal{C} = A_1, A_2, \cdots, A_r$, which are conjugate
with each other, is called a class of the group. If the group contains the
elements $A_1\,(= E), A_2, \cdots, A_g$ the class of A may be found by calculating

$$E^{-1}AE = A, \quad A_2^{-1}AA_2, \quad \cdots, \quad A_g^{-1}AA_g$$

although not all of these elements will be distinct as may be seen from the
following example. Clearly 𝒞₁ = E always forms a class by itself. In
(2), 𝒞₂ = A, B, for

E⁻¹AE = A;   B⁻¹AB = A;   D⁻¹AD = B

A⁻¹AA = A;   C⁻¹AC = B;   F⁻¹AF = B


Similarly 𝒞₃ = C, D, F. By arguments similar to those used in discussing
subgroups it follows that the whole group may be separated into a number
of different classes none of which contain any elements in common.
Moreover, if there are h elements of a group which transform a given element into
one and the same element of its class, then the number of elements in that
class r = g/h where g is the order of the group.

15.4. Complexes.—A set of elements from a group, considered as a
whole, is called a complex. If the complex 𝒜 contains A, B, C then C𝒜
contains CA, CB, C². By the product of two complexes 𝒜ℬ we mean the
product of every element in 𝒜 with every element in ℬ, but products
occurring more than once are only taken once. By the complex 𝒢 we mean the
whole group. If ℋ is a subgroup, then

$$\mathcal{H}\mathcal{H} = \mathcal{H}^2 = \mathcal{H} \tag{15-3}$$

If X is an element of 𝒢 not contained in ℋ then the complex ℋX is called
a right coset (Nebengruppe) and Xℋ is a left coset. Cosets are not groups
since ℋX does not contain E. It is easy to see that if another element Y is
neither in ℋ nor in ℋX, then the coset ℋY will contain no element common
with ℋ or ℋX, so that the whole group may be written as a sum of a finite
number of cosets

$$\mathcal{G} = \mathcal{H} + \mathcal{H}X + \mathcal{H}Y + \mathcal{H}Z + \cdots$$

The group may also be divided in this way by means of left cosets. In
D₃, we may write

$$\mathcal{G} = \mathcal{H} + \mathcal{H}C = \mathcal{H} + \mathcal{H}D = \mathcal{H} + \mathcal{H}F = \mathcal{H} + C\mathcal{H} = \mathcal{H} + D\mathcal{H} = \mathcal{H} + F\mathcal{H}$$

where ℋ = E, A, B. The index of a subgroup equals the order of the
group divided by the order of the subgroup. It also equals the number of
complexes obtained by splitting a group into that particular subgroup and
its cosets; two, in the example just given.
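The class and coset decompositions of D₃ can be generated by direct enumeration. As before, the sketch assumes the relations A³ = E, C² = E, CA = A²C, stores AⁱCʲ as the pair (i, j), and uses hypothetical function names.

```python
names = {(0, 0): "E", (1, 0): "A", (2, 0): "B",
         (0, 1): "C", (1, 1): "D", (2, 1): "F"}
UNIT = (0, 0)

def mul(x, y):
    """Product xy from A^3 = E, C^2 = E, CA = A^2 C."""
    (i, j), (k, l) = x, y
    return ((i + (2 * k if j else k)) % 3, (j + l) % 2)

def inverse(x):
    """The element y with xy = E."""
    return next(y for y in names if mul(x, y) == UNIT)

def transform(a, x):
    """X^-1 A X, the transform of A by X."""
    return mul(mul(inverse(x), a), x)

# Classes: collect the transforms of each element by all six elements.
classes = {frozenset(names[transform(a, x)] for x in names) for a in names}

A = (1, 0)
h = sum(1 for x in names if transform(A, x) == A)   # elements that leave A unchanged
r = len(names) // h                                  # predicted size of the class of A

# Cosets of the subgroup E, A, B.
H = [(0, 0), (1, 0), (2, 0)]
right_cosets = {frozenset(names[mul(a, x)] for a in H) for x in names}
left_cosets = {frozenset(names[mul(x, a)] for a in H) for x in names}

print(sorted(sorted(c) for c in classes))
print(h, r)
print(sorted(sorted(c) for c in right_cosets), sorted(sorted(c) for c in left_cosets))
```

Three classes appear, namely E alone, the pair A, B and the triple C, D, F. For the element A there are h = 3 elements leaving it unchanged, which gives r = 6/3 = 2 members in its class. The subgroup E, A, B has just one coset besides itself, whether formed on the right or on the left, so its index is two.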

15.5. Conjugate Subgroups.—If a subgroup ℋ contains the elements
$H_1\,(= E), H_2, \cdots, H_h$ then it also contains $EH_i = H_i, H_2H_i, \cdots, H_hH_i$
for every $H_i$ in ℋ, and it contains $H_i^{-1}E = H_i^{-1}, H_i^{-1}H_2, \cdots, H_i^{-1}H_h$. In
fact these arrangements of the h elements of ℋ are identical except for the
sequence in which the members are written. Still another arrangement is
$H_i^{-1}EH_i = E, H_i^{-1}H_2H_i, \cdots, H_i^{-1}H_hH_i$. To see this, sort out the
arrangement $EH_i = H_i, \cdots, H_hH_i$ so that the natural order $H_1, H_2, \cdots, H_h$ is
regained and multiply each element by $H_i^{-1}$. A similar argument will
show that for X, any member of the group (not necessarily contained in ℋ),
X⁻¹ℋX is also a subgroup, but X⁻¹ℋX and ℋ, called conjugate subgroups,
may be different if X is not in ℋ. When ℋ and X⁻¹ℋX are identical for
every X in the group, ℋ is called an invariant subgroup or a normal divisor.
To illustrate these statements choose ℋ = E, C and ℋ = E, A, B from
D₃. It is easily verified that the only invariant subgroup of D₃ (apart from
E and the group itself) is ℋ = E, A, B. The invariant subgroup and its
cosets form a group³ called the quotient (or factor) group with the invariant
subgroup as unit element. In D₃, if ℱ = ℋC, then the multiplication table
of the quotient group 𝒢/ℋ is

        | ℋ  ℱ
     ---+------
      ℋ | ℋ  ℱ          (15-4)
      ℱ | ℱ  ℋ

15.6. Isomorphism.—Two groups 𝒢 and 𝒢′ are said to be simply
isomorphic if to each element A, B, C, ··· of 𝒢 there corresponds an element
A′, B′, C′, ··· of 𝒢′ so that if AB = C, then A′B′ = C′ for every product.
In the general case, two or more elements of one group may be isomorphous
with a single element of another group. Thus the quotient group (4) is
multiply isomorphous with D₃, for ℋ corresponds to E, A, B and ℱ to
C, D, F.

In order to find a group which is simply isomorphous with D₃, we
consider the n! permutations of n symbols. By (acbed) we shall mean a
replaced by c, c replaced by b, b by e, e by d and d by a. This may also be
written as (bedac) or (dacbe) as long as we do not change the cyclic order of
the symbols. When a single letter occurs in a parenthesis, that letter is
unaffected by the permutation, hence we will write (bce)(a)(d) as (bce).
By the product of two permutations, we mean the permutation indicated
in the right parenthesis followed by the permutation in the left parenthesis.
For example, in the product (acbed)(bce), b is replaced by c and then c by b,
the net result for b being that it returns to its original position. Continuing
in this way, we obtain (acbed)(bce) = (cda). If we use only three letters
and write

      E = (a)(b)(c)     B = (abc)       D = (ac)(b)
      A = (acb)         C = (a)(bc)     F = (ab)(c)          (15-5)

the resulting operations form a group which is simply isomorphic with D₃,
for

      AB = (acb)(abc) = (a)(b)(c) = E
      BC = (abc)(bc) = (ab) = F; etc.

Problem. Derive the complete multiplication table for the group of permutations 
on three letters. 
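A mechanical route to the table asked for in this problem is sketched below. The conventions are those stated in the text: in a cycle each letter is replaced by the letter following it, and in a product the right-hand permutation acts first. The dictionary representation and the names `from_cycles`, `mul` and `name_of` are hypothetical choices made for illustration.

```python
def from_cycles(letters, *cycles):
    """Build a permutation as a map letter -> replacement.

    A cycle such as "acb" sends a to c, c to b and b back to a;
    letters not mentioned are left unchanged.
    """
    perm = {x: x for x in letters}
    for cyc in cycles:
        for i, x in enumerate(cyc):
            perm[x] = cyc[(i + 1) % len(cyc)]
    return perm

def mul(p, q):
    """Product pq: the right-hand permutation q acts first, then p."""
    return {x: p[q[x]] for x in q}

three = "abc"
perms = {"E": from_cycles(three), "A": from_cycles(three, "acb"),
         "B": from_cycles(three, "abc"), "C": from_cycles(three, "bc"),
         "D": from_cycles(three, "ac"), "F": from_cycles(three, "ab")}

def name_of(p):
    """Label of the permutation p among the six defined above."""
    return next(n for n, q in perms.items() if q == p)

table = {(x, y): name_of(mul(perms[x], perms[y])) for x in perms for y in perms}
print("    " + " ".join(perms))
for x in perms:
    print(x + " | " + " ".join(table[x, y] for y in perms))

# The five-letter product worked in the text.
five = "abcde"
five_letter_product = mul(from_cycles(five, "acbed"), from_cycles(five, "bce"))
print(five_letter_product == from_cycles(five, "cda"))
```

The printed rows can be compared entry by entry with the multiplication table of D₃, which is what simple isomorphism requires.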


3 Note that the elements of the quotient group are complexes, i.e., collections of the
original elements of 𝒢.

15.7. Representation of Groups.—If to every member of a group 𝒢,
$A_1, A_2, A_3, \cdots$, we can associate a square matrix $D(A_1), D(A_2), D(A_3), \cdots$
in such a way that $A_iA_j = A_k$ implies $D(A_i)D(A_j) = D(A_k)$, then the
matrices themselves form a group isomorphous with 𝒢. Such matrices
are a representation of the group; their order is the degree or dimension of
the representation. One trivial example of a representation is the unit
matrix E associated with every element of the group. A representation
for D₃ may be obtained from its quotient group if we associate ℋ with the
matrix [1] and ℱ with the matrix [−1].

To find another representation of D₃, let us think of the symbols
a, b, c as the components of a vector x and the elements of the group as
operations which change x into a new vector x′ with the same components
but in a different order. Hence the required representation D will be a
matrix such that x′ = Dx where the rows and columns are labelled with the
components a, b, c. Now E is the operation which replaces each component
by itself so D(E) is the unit matrix. On the other hand, A replaces a by
c, but a itself becomes b, etc., so unity will appear in D(A) at the
intersection of the a-th row and the b-th column, etc. Continuing in this way,
we find

$$D(E) = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{bmatrix}; \quad
D(A) = \begin{bmatrix} 0&1&0\\ 0&0&1\\ 1&0&0 \end{bmatrix}; \quad
D(B) = \begin{bmatrix} 0&0&1\\ 1&0&0\\ 0&1&0 \end{bmatrix};$$
$$D(C) = \begin{bmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{bmatrix}; \quad
D(D) = \begin{bmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{bmatrix}; \quad
D(F) = \begin{bmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{bmatrix} \tag{15-5a}$$

By multiplying the matrices together, it will be seen that the
multiplication table (2) is reproduced. For example, D(A)D(B) = D(E) and
D(A)D(C) = D(D). Thus (5a) is a representation of D₃.
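The multiplication of these matrices is easily automated. The sketch below takes the six permutation matrices in the form D(E) = unit matrix, D(A) with rows (0 1 0), (0 0 1), (1 0 0), and so on as listed in the code; the helper names `matmul` and `name_of` are hypothetical.

```python
def matmul(P, Q):
    """Product of two 3 x 3 matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

D = {
    "E": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "A": [[0, 1, 0], [0, 0, 1], [1, 0, 0]],
    "B": [[0, 0, 1], [1, 0, 0], [0, 1, 0]],
    "C": [[1, 0, 0], [0, 0, 1], [0, 1, 0]],
    "D": [[0, 0, 1], [0, 1, 0], [1, 0, 0]],
    "F": [[0, 1, 0], [1, 0, 0], [0, 0, 1]],
}

def name_of(M):
    """Label of the matrix M among the six matrices above."""
    return next(n for n, X in D.items() if X == M)

table = {(x, y): name_of(matmul(D[x], D[y])) for x in D for y in D}
traces = {n: sum(M[i][i] for i in range(3)) for n, M in D.items()}

for x in D:
    print(x + " | " + " ".join(table[x, y] for y in D))
print(traces)
```

Every product of two of the matrices is again one of the six, and the resulting table has the same entries as the group table. The traces, 3 for E, 0 for A and B, and 1 for C, D and F, are the same for elements of the same class.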

Suppose a representation of a group has been found, consisting of
matrices $D = D(A_1), D(A_2), \cdots, D(A_g)$, each matrix being of dimension
n. Then it is often possible to find a new coordinate system, i.e., a
transformation of the type Q⁻¹DQ, such that every matrix D is changed to the
form

$$\begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix} \tag{15-6}$$

where D₁ is of order m, m < n and D₂ is of order (n − m). Under these
conditions, the representation D is said to be reducible⁴ into D₁ and D₂.
We now examine D₁ and D₂ to see if they are reducible, continuing until D
is completely reduced. When this has been accomplished, we will have a
relation between the original and final coordinate systems such as z = Qx
and

$$Q^{-1}DQ = \mathrm{diag}\,\bigl(\Gamma^{(1)}, \Gamma^{(2)}, \cdots, \Gamma^{(s)}\bigr) = \Gamma \tag{15-7}$$

where the $\Gamma^{(i)}$ are themselves matrices.

4 In the more general case, the matrices are converted to the triangular form of
(10-40). If the form obtained is that of (6), the representation is said to be completely
reducible.

It should be understood that if there are g elements in the group, there
will be g equations like (7), one for each element; D means the set of g
matrices in the original coordinate system and Γ means the same matrices
in the new coordinate system. Suppose there are s irreducible
representations in (7), $\Gamma^{(1)}, \Gamma^{(2)}, \cdots, \Gamma^{(s)}$; each one of these is a set of g matrices,
one for each element of the group, $\Gamma^{(i)} = \Gamma^{(i)}(A_1), \Gamma^{(i)}(A_2), \cdots, \Gamma^{(i)}(A_g)$.
Each Γ is isomorphous with the corresponding D in the original
coordinate system since the two sets of matrices are related to each other by a
collineatory transformation (cf. sec. 10.11).

It may happen that some $\Gamma^{(i)}$ may appear more than once or not at all
in the reduction of a given representation. To indicate this, we rewrite (7)
as

$$\Gamma = c_1\Gamma^{(1)} + c_2\Gamma^{(2)} + \cdots + c_s\Gamma^{(s)} \tag{15-8}$$

where the c’s are positive integers or zero. Such an expression, called the
direct sum, is not meant to imply that the $\Gamma^{(i)}$ are to be added. It is
simply a shorthand method of showing that the matrices D have been
reduced to the form (7).

It is of considerable advantage to choose unitary or orthogonal matrices
as the representations of groups and we shall suppose that this is always
done. Under these conditions the following statements may be proved.⁵
Two irreducible representations will be orthogonal, and if $d_i$ is the
dimension of $\Gamma^{(i)}$, then

$$\sum_{A} \Gamma^{(i)}_{\mu\nu}(A)\,\Gamma^{(j)}_{\alpha\beta}(A)^{*} = \frac{g}{(d_i d_j)^{1/2}}\,\delta_{ij}\,\delta_{\mu\alpha}\,\delta_{\nu\beta} \tag{15-9}$$

the summation to be made over the g elements of the group, $A_1, A_2, \cdots, A_g$.
Moreover if there are s classes of elements in a group, there will be exactly
s different irreducible representations and

$$d_1^2 + d_2^2 + \cdots + d_s^2 = g \tag{15-10}$$
It is not always possible to obtain all s of the irreducible representations
from a single set of reducible matrices D since some of the $c_i$ in (8) may be
zero. If this is the case, another set of matrices D′ must be found and
these must be reduced in the same way until the complete set is obtained.


5 See texts on group theory cited at end of this chapter.

15.8. Reduction of a Representation.—We now wish to show how it is
possible to find all of the irreducible representations for D₃. Since g = 6
and s = 3, it follows from (10) that they are of dimension 1, 1 and 2. To
find the two representations⁶ of degree one, we consider the quotient group
(4) with two classes, 𝒞⁺ containing ℋ and 𝒞⁻ containing ℱ. Its two
representations are $\Gamma^{(1)}(\mathcal{C}^+) = \Gamma^{(1)}(\mathcal{C}^-) = 1$; $\Gamma^{(2)}(\mathcal{C}^+) = 1$; $\Gamma^{(2)}(\mathcal{C}^-) = -1$.
While these are almost trivial, it is seen that they satisfy all of the
requirements for a representation of D₃. They are therefore taken as its
two representations of degree one.

In order to obtain the representation of dimension two, we attempt to
reduce the matrices of (5a). We expect to get, as a result, matrices of
the form of (6) where D₁ is either $\Gamma^{(1)}$ or $\Gamma^{(2)}$, and D₂ is a set of
two-dimensional matrices. We note that each of the matrices of (5a) is
orthogonal and from the discussion of sec. 10.17 we see that another real
orthogonal matrix will reduce any of them to the desired form. The
columns of the reducing matrix will be composed of the eigenvectors of one
of the matrices to be reduced. If we choose D(A) we find that its
eigenvalues are 1, $e^{\pm i\varphi}$, where φ = 2π/3. Taking linear combinations of the
complex eigenvectors and normalizing them the result is $3^{-1/2}[1, 1, 1]$;
$6^{-1/2}[1, -2, 1]$; $2^{-1/2}[-1, 0, 1]$. They form the columns of a matrix
Q which will reduce each matrix of (5a) by the transformation indicated
in (7). The diagonal elements will be $\Gamma^{(1)}$ and the two-dimensional matrices
which were sought. A typical result is

$$Q^{-1}D(A)\,Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1/2 & -\sqrt{3}/2 \\ 0 & \sqrt{3}/2 & -1/2 \end{bmatrix} \tag{15-11}$$
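The reduction can be verified by carrying out the matrix products numerically. The sketch assumes that Q has the columns 3^(−1/2)[1, 1, 1], 6^(−1/2)[1, −2, 1] and 2^(−1/2)[−1, 0, 1], that D(A) and D(C) are the permutation matrices with rows (0 1 0), (0 0 1), (1 0 0) and (1 0 0), (0 0 1), (0 1 0) respectively, and that Q⁻¹ may be replaced by the transpose because Q is orthogonal. The helper names are hypothetical.

```python
import math

def matmul(P, Q):
    """Product of two 3 x 3 matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(M):
    return [list(row) for row in zip(*M)]

r3, r6, r2 = math.sqrt(3.0), math.sqrt(6.0), math.sqrt(2.0)
eigenvectors = [[1 / r3, 1 / r3, 1 / r3],
                [1 / r6, -2 / r6, 1 / r6],
                [-1 / r2, 0.0, 1 / r2]]
Q = transpose(eigenvectors)            # the vectors become the columns of Q

DA = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
DC = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

def reduce(M):
    """Q^-1 M Q, with Q^-1 = Q^T since Q is orthogonal."""
    return matmul(transpose(Q), matmul(M, Q))

reduced_A = reduce(DA)
reduced_C = reduce(DC)
for row in reduced_A:
    print([round(v, 6) for v in row])
for row in reduced_C:
    print([round(v, 6) for v in row])
```

Both results have the block form of (6): a one in the upper left corner and a two-dimensional matrix below it. The trace of each matrix is unchanged by the transformation.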


Other methods⁷ of reducing a given representation may be found.
Consider, for example, the effect of the matrices (5a) in changing a vector
x into another vector x′ by the relation x′ = Dx. In such an operation
two components of the vector are simply interchanged in their original
plane or else both of them are transferred to a plane perpendicular to
the one in which they originally lay. These relations could be examined
in a new coordinate system in which x₁ is along the normal to the plane
determined by x₂ and x₃. Calling the new system y, not necessarily a
rectangular Cartesian one, x₂ could be taken in the plane of y₁, y₂ and
x₃ in the plane of y₂, y₃. The relation between the new and old coordinates
is then x = Sy, where x₁ = y₁ + y₂ + y₃; x₂ = y₁ − y₂; x₃ = y₂ − y₃.
In this new system, x′ = Sy′ and, since S is non-singular, y′ = S⁻¹DSy,
according to sec. 10.13. When the reciprocal matrix is found (note that
S is not orthogonal)

$$S^{-1}D(A)\,S = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & -1 & 0 \end{bmatrix}$$

with similar results for the remaining matrices of (5a). We note again
that we have found $\Gamma^{(1)}$ and two-dimensional representations.

Although these two reductions do not give identical results, their
matrices are related by a similarity transformation and their traces are
identical as can be seen by comparing S⁻¹D(A)S with eq. (11). The
importance of this property is explained in the next section.

6 The reader will recall that the complex ℋ must be regarded as a single element of
the quotient group 𝒢/ℋ. It is true that ℋ is made up of the elements E, A, and B of
the original group D₃, but it acts as the unit element of 𝒢/ℋ. The other element
of the quotient group, ℱ, contains the elements C, D, F of D₃.

7 A formal method, based on hypercomplex numbers in a general type of algebra,
called Frobenius algebra, is described by Speiser, Littlewood, and other references
cited at the end of this chapter.

In the usual case, it is easier to find another representation of the
required dimension than to reduce one already known. Consider a plane,
equilateral triangle with apexes labeled a, b, c and located in a Cartesian
coordinate system so that the coordinates of its apexes are a = (1,0);
b = ½(−1, √3); c = −½(1, √3). The elements of the permutation
group (5) will then be seen to correspond with the following operations on
this triangle: (E) identity; (A) rotation of the triangle about the origin
of the coordinate system through the angle 2π/3 in the counter-clockwise
direction; (B) rotation by 4π/3 in the same direction, or by 2π/3 in the
clockwise direction; (C) rotation through the angle π about an axis lying
in the plane of the triangle and passing through y = 0; (D) a similar
rotation about an axis through y = −√3x, which passes through the apex b
of the triangle; (F) rotation about an axis passing through the apex c,
or y = √3x.

Since we are considering a space-fixed coordinate system and we are
moving the triangle rather than the coordinate system, the appropriate
two-dimensional matrices for A and B are the transforms of eq. (52) given
later in this chapter, with φ(A) = 2π/3 and φ(B) = 4π/3. Operation C
merely changes the sign of the y-coordinates and its matrix is that of
eq. (61). The two remaining matrices for D and F are easily obtained
from the multiplication table for the group, for example, D = CB, F = CA.
They could also be found from geometric considerations.

We now have the three irreducible representations for D₃. The
one-dimensional representations have been given in the first paragraph of
this section. Using the matrices obtained from operations with the
triangle, the abbreviated notations E, A, B, ..., instead of the more
explicit forms $\Gamma^{(3)}(E)$, etc., and writing c = cos φ = −1/2; s = sin φ =
√3/2, φ = 2π/3, the two-dimensional representations are given in Table 1.


TABLE 1 


15.9. The Character.—The task of finding all the irreducible
representations of a given group is usually very laborious. However, for most
physical applications, it is sufficient to know only their trace, a quantity
called the character⁸ in group theory. We shall indicate the trace of $\Gamma^{(i)}$
by $\chi^{(i)} = \chi^{(i)}(A_1), \chi^{(i)}(A_2)$, etc. A further simplification is afforded by the
fact that the character of every element in a single class is identical. This
follows from the fact that elements in the same class are related to each
other by a similarity transformation and, as we have shown in sec. 10.11,
the trace of two quantities so related is identical.
Therefore, if we know all the characters of one element from every class
of the group, we have all of the information concerning the group which is
usually needed. We shall indicate the particular class to which we refer
by a subscript, so that the s characters $\chi_1^{(i)}, \chi_2^{(i)}, \cdots, \chi_s^{(i)}$ refer to the
i-th irreducible representation.

The following properties of the characters may be derived⁹ or verified
using tables of characters given in later sections.

a. The class 𝒞₁ = E is always represented by the unit matrix, thus
$\chi_1^{(i)}$ equals the dimension of the representation and hence must be a divisor
of the order of the group. We also see from (10) that

$$\sum_{i=1}^{s}\bigl[\chi_1^{(i)}\bigr]^2 = g \tag{15-12}$$

8 The character (especially of permutation groups) is treated in detail by Little- 
wood, D. E., “ The Theory of Group Characters,” Oxford University Press, 1940. 
? Cf. Speiser, loc. cit., Chapter 12. 
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If g and s are known, it will usually be found that there is but a single way 
in which this equation can he satisfied. 


b. From (9) it follows that the s characters also form an orthogonal 
system. Summing over the classes we obtain 


& 
Era xr * = gij (15-13) 
q= 


where rq is the number of elements in the g-th class. 


c. If $\Xi$ is the character of a reducible representation, then from (8),
we have

$$\Xi = c_1\chi^{(1)} + c_2\chi^{(2)} + \cdots + c_s\chi^{(s)} \tag{15-14}$$

On multiplying this by $\chi_q^{(j)*}$ and summing over q, we obtain, using (13),

$$c_j = \frac{1}{g} \sum_{q=1}^{s} r_q\, \Xi_q\, \chi_q^{(j)*} \tag{15-15}$$


When the complete multiplication table for a group is known, the following
procedure¹⁰ may be used to obtain the characters. First calculate the
product of all elements in the class $C_i$ by all elements in $C_k$. It will be
found that the resulting set of elements may be uniquely arranged in classes
and that the same results are obtained irrespective of whether we multiply
$C_i$ by $C_k$ or the reverse. Now a given class may occur in the products
several times or not at all. Let us use $h_{ik,j}$ to indicate the number of
times the j-th class appears. Then if we abandon our earlier rule for the
multiplication of complexes (cf. sec. 15.4) and take each element of the
product as many times as it occurs, we may write

$$C_i C_k = C_k C_i = \sum_{j=1}^{s} h_{ik,j}\, C_j$$

where we sum over the total number of classes, s. Having found the numbers
$h_{ik,j}$, it is then possible to find the characters from the relations

$$r_i r_k \chi_i \chi_k = \chi_1 \sum_{j=1}^{s} r_j\, h_{ik,j}\, \chi_j \tag{15-16}$$

where $r_j$ is the number of elements in $C_j$.
As an example of the use of this equation, we find for $D_3$

$$C_2^2 = A^2,\ B^2,\ AB,\ BA = 2C_1 + C_2$$

$$C_3^2 = 3C_1 + 3C_2; \qquad C_2 C_3 = 2C_3$$

(The other products are not needed.) Since $r_1 = 1$; $r_2 = 2$; $r_3 = 3$, we
have

$$\begin{aligned}
4\chi_2^2 &= \chi_1 (2\chi_1 + 2\chi_2) \\
9\chi_3^2 &= \chi_1 (3\chi_1 + 6\chi_2) \\
6\chi_2\chi_3 &= 6\chi_1\chi_3
\end{aligned} \tag{15-17}$$

From (12), we know that $\chi_1$ has the values 1, 1 and 2. Solving (17) with
each of these quantities in turn we obtain the entries in Table 2. They are
identical with the trace of the matrices of the last section.

¹⁰ Proof of the statements in this paragraph may be found in Murnaghan, p. 83;
Speiser, p. 170, loc. cit. They may be verified by using the multiplication
table for $D_3$.


TABLE 2

                      $C_1$    $C_2$    $C_3$
$\Gamma^{(1)}$          1        1        1
$\Gamma^{(2)}$          1        1       -1
$\Gamma^{(3)}$          2       -1        0
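The class-multiplication procedure can be checked numerically. What follows is only a minimal sketch, not the procedure of the references: it assumes that $D_3$ is realized as the six permutations of three symbols (the isomorphism noted in sec. 15.7), and the helper names `compose`, `conjugacy_classes` and `class_products` are illustrative. It counts the coefficients $h_{ik,j}$ and tests whether each row of Table 2 satisfies eq. (16).

```python
from collections import Counter
from itertools import permutations


def compose(p, q):
    """Permutation obtained by applying q first, then p (tuples acting on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def conjugacy_classes(group):
    """Split the group into classes of mutually conjugate elements."""
    classes, seen = [], set()
    for a in group:
        if a in seen:
            continue
        cls = {compose(inverse(x), compose(a, x)) for x in group}
        classes.append(sorted(cls))
        seen |= cls
    return classes


def class_products(classes):
    """h[i][k][j] = number of times class j occurs in the product C_i C_k."""
    s = len(classes)
    h = [[[0] * s for _ in range(s)] for _ in range(s)]
    for i in range(s):
        for k in range(s):
            counts = Counter(compose(a, b) for a in classes[i] for b in classes[k])
            for j in range(s):
                # every element of class j occurs equally often, so one
                # representative is enough
                h[i][k][j] = counts[classes[j][0]]
    return h


group = sorted(permutations(range(3)))
classes = sorted(conjugacy_classes(group), key=len)  # class sizes 1, 2, 3
r = [len(c) for c in classes]
h = class_products(classes)

table2 = [(1, 1, 1), (1, 1, -1), (2, -1, 0)]  # rows of Table 2


def satisfies_15_16(chi):
    s = len(classes)
    return all(
        r[i] * r[k] * chi[i] * chi[k]
        == chi[0] * sum(r[j] * h[i][k][j] * chi[j] for j in range(s))
        for i in range(s)
        for k in range(s)
    )
```

With the classes ordered by size, `h[1][1]`, `h[2][2]` and `h[1][2]` hold the coefficients of the three class products quoted for $D_3$, and `satisfies_15_16` can be applied to each row of `table2`.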


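Let us apply eq. (15) to the matrices (5a) and confirm a fact that we
already know, namely, that these reducible representations contain
$\Gamma^{(1)}$ and $\Gamma^{(3)}$ once each but not $\Gamma^{(2)}$. From (5a), we see
that $\Xi_1 = 3$; $\Xi_2 = 0$; $\Xi_3 = 1$, hence, using eq. (15),

$$\begin{aligned}
c_1 &= (1\cdot 3\cdot 1 + 2\cdot 0\cdot 1 + 3\cdot 1\cdot 1)/6 = 1 \\
c_2 &= (1\cdot 3\cdot 1 + 2\cdot 0\cdot 1 - 3\cdot 1\cdot 1)/6 = 0 \\
c_3 &= (1\cdot 3\cdot 2 - 2\cdot 0\cdot 1 + 3\cdot 1\cdot 0)/6 = 1
\end{aligned}$$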
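The same reduction can be written as a few lines of code. This is a small sketch under stated assumptions: the class sizes and the rows of the $D_3$ character table are typed in by hand, the labels `Gamma1`, `Gamma2`, `Gamma3` are illustrative names, and the complex conjugate in eq. (15) is omitted because all characters of $D_3$ are real.

```python
from fractions import Fraction

class_sizes = [1, 2, 3]  # r_q for the classes C_1, C_2, C_3 of D_3
table2 = {
    "Gamma1": [1, 1, 1],
    "Gamma2": [1, 1, -1],
    "Gamma3": [2, -1, 0],
}


def multiplicities(xi, table, sizes):
    """Eq. (15) for real characters: c_j = (1/g) * sum_q r_q * Xi_q * chi_q^(j)."""
    g = sum(sizes)
    return {
        name: Fraction(sum(r * x * c for r, x, c in zip(sizes, xi, chi)), g)
        for name, chi in table.items()
    }


# reducible character of the three-dimensional permutation matrices
counts = multiplicities([3, 0, 1], table2, class_sizes)
```

Exact fractions are used so that a non-integral result, which would signal a mistyped character, is not hidden by rounding.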


We have shown how a reducible representation of the group $D_3$ may be
found (cf. sec. 15.7). Now elements occur on the diagonal of the matrices
of eq. (5a) only when the symbols are unchanged by the permutations with
which $D_3$ is isomorphous. Since these diagonal elements are all unity, the
reducible character of an element of a permutation group is equal to the
number of symbols unchanged by the permutation. This result is very
useful, for every group is isomorphous with some permutation group;
hence when the latter is known it is a simple matter to find $\Xi_q$.
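As an illustration of this rule, the sketch below counts unchanged symbols for the six permutations of three symbols and groups the counts by cycle structure. It assumes the permutations are stored as tuples of images of 0, 1, 2; the function names are illustrative.

```python
from itertools import permutations


def fixed_points(p):
    """Number of symbols left unchanged by the permutation p."""
    return sum(1 for i, image in enumerate(p) if i == image)


def cycle_type(p):
    """Sorted tuple of cycle lengths, used here to label the class of p."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))


character_by_class = {}
for p in permutations(range(3)):
    character_by_class.setdefault(cycle_type(p), set()).add(fixed_points(p))
```

Each class maps to a single value, which is the reducible character of that class: 3 for the identity, 0 for the three-cycles and 1 for the transpositions.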


Problem. Derive Table 2 by the method described in the text.


15.10. The Direct Product.—Two cyclic groups were combined in (1)
to form a single larger group. We now describe another method of
augmenting the order of a group. Suppose $\mathcal{G}'$ is of order m with
elements $A_1, A_2, \cdots, A_m$ and $\mathcal{G}''$ is of order n with elements
$B_1, B_2, \cdots, B_n$ and that every A commutes with every B. Then the mn
elements $A_iB_j$ form a group of order mn called the direct product of
$\mathcal{G}'$ and $\mathcal{G}''$, $\mathcal{G} = \mathcal{G}' \times \mathcal{G}''$. If the
matrices $\Gamma(A)$ and $\Gamma(B)$ are irreducible representations of
$\mathcal{G}'$ and $\mathcal{G}''$, then their direct product

$$\Gamma(A) \times \Gamma(B) = \Gamma(AB)$$

is a representation of $\mathcal{G}$. Moreover, if $\chi_s^{(i)}$ is a character of
$C_s$ in $\mathcal{G}'$ and $\chi_t^{(j)}$ belongs to $C_t$ in $\mathcal{G}''$, then the
characters of $C_{st}$ in $\mathcal{G}$ are given by

$$\chi_{st}^{(ij)} = \chi_s^{(i)} \chi_t^{(j)}$$

If one or both of the representations $\Gamma(A)$ and $\Gamma(B)$ are of the first
degree, the direct product $\Gamma(AB)$ is irreducible. If both are of degree
higher than one, $\Gamma(AB)$ is reducible. The reduction is very simple provided
the table of characters for both groups is known, for multiplication of one
set of characters by another will give a sum of characters already contained
in the table. This can always be uniquely resolved into its component parts.
An illustration of such reduction will be given in sec. 15.18.
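The product rule for characters is easy to try out. The sketch below is an illustration only: it assumes the first factor is $D_3$ (character table typed in by hand) and the second factor is a group of order two with characters (1, 1) and (1, -1). It forms all products of characters and checks the orthogonality relation (13) for the resulting table.

```python
from itertools import product


def direct_product_table(table_a, sizes_a, table_b, sizes_b):
    """Character table and class sizes of the direct product of two groups."""
    sizes = [ra * rb for ra, rb in product(sizes_a, sizes_b)]
    table = [
        [xa * xb for xa, xb in product(chi_a, chi_b)]
        for chi_a, chi_b in product(table_a, table_b)
    ]
    return table, sizes


def is_orthogonal(table, sizes, tol=1e-12):
    """Check eq. (13): sum_q r_q chi_q^(i) chi_q^(j)* = g delta_ij."""
    g = sum(sizes)
    for i, chi_i in enumerate(table):
        for j, chi_j in enumerate(table):
            inner = sum(
                r * a * complex(b).conjugate() for r, a, b in zip(sizes, chi_i, chi_j)
            )
            if abs(inner - (g if i == j else 0)) > tol:
                return False
    return True


d3_table, d3_sizes = [[1, 1, 1], [1, 1, -1], [2, -1, 0]], [1, 2, 3]
c2_table, c2_sizes = [[1, 1], [1, -1]], [1, 1]
table, sizes = direct_product_table(d3_table, d3_sizes, c2_table, c2_sizes)
```

The classes of the product are ordered with the class index of the second factor running fastest, matching the order in which the characters are multiplied.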


SOME SPECIAL GROUPS 


15.11. The Cyclic Group.—If a cyclic group is formed from {A},
$A^n = E$ and X is any element of the group defined by $X = A^m$, m = 1, 2,
..., n, then $X^{-1}AX = A$. It thus follows that every element of a cyclic
group or any other Abelian group is in a class by itself. Moreover, we see
from (10) that the n irreducible representations will each be of degree one
so that each representation is also a character. Now if $\epsilon = \exp(2\pi i/n)$,
then $\epsilon$ will be a representation and a character for A and $\epsilon^m$ will be
a character for $A^m$ (m = 1, 2, ..., n), since these n numbers will satisfy the
multiplication properties of the group elements. Moreover, $\epsilon^{pm}$ will also
serve as a set of characters for the same reason. In fact the n distinct powers
of $\epsilon$ (m = 1, 2, ..., n) will give the n characters for each of the n elements.
They are shown in Table 3. We can simplify such a table by using
de Moivre's theorem: $\epsilon^p = \cos 2\pi p/n + i \sin 2\pi p/n$. For example, if
n = 4, the only numbers that will occur are ±1 and ±i.
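The characters of a cyclic group can be generated directly from the n-th root of unity. The following sketch is illustrative: rows and columns are counted from zero, so that row p and column m hold the character $\epsilon^{pm}$ of $A^m$, and the function names are not taken from the text.

```python
import cmath


def cyclic_character_table(n):
    """Row p, column m (both counted from 0) holds eps**(p*m), the character of A**m."""
    eps = cmath.exp(2j * cmath.pi / n)
    return [[eps ** ((p * m) % n) for m in range(n)] for p in range(n)]


def respects_multiplication(row, tol=1e-12):
    """Check chi(A**k) * chi(A**l) == chi(A**(k+l)) for a one-dimensional representation."""
    n = len(row)
    return all(
        abs(row[k] * row[l] - row[(k + l) % n]) < tol
        for k in range(n)
        for l in range(n)
    )


table4 = cyclic_character_table(4)
```

For n = 4 every entry of `table4` agrees, up to rounding, with one of the four numbers 1, i, -1, -i, as stated above.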


TABLE 3

                  $C_1 = A^n = E$   $C_2 = A$        $C_3 = A^2$         ...   $C_n = A^{n-1}$
$\Gamma^{(1)}$          1             1                1                 ...   1
$\Gamma^{(2)}$          1             $\epsilon$       $\epsilon^2$      ...   $\epsilon^{n-1}$
  ...
$\Gamma^{(m)}$          1             $\epsilon^{m-1}$ $\epsilon^{2(m-1)}$ ... $\epsilon^{(n-1)(m-1)}$
  ...
$\Gamma^{(n)}$          1             $\epsilon^{n-1}$ $\epsilon^{2(n-1)}$ ... $\epsilon^{(n-1)^2}$

15.12. The Symmetric Group.—Consider a particular permutation of
five letters which we write as

$$P_1 = \begin{pmatrix} a & b & c & d & e \\ c & e & b & a & d \end{pmatrix}$$

This is to be interpreted as meaning: a is replaced by c, b by e, c by b,
d by a, and e by d. A more convenient and equivalent form for such a
permutation is

$$P_1 = (acbed)$$

which we have already used in sec. 15.6. The one-line form is called a
cycle; its degree equals the number of letters in the parenthesis. It will be
found that any permutation may be written as a single cycle with no letter
repeated, or as a product of two or more cycles, none of which has a letter in
common. Provided their proper sequence is retained, the letters in a cycle
may be rearranged, but the number and degree of all cycles corresponding
to a given permutation is unique. For example,

$$P_2 = \begin{pmatrix} a & b & c & d & e \\ c & e & a & b & d \end{pmatrix} = (ac)(bed) = (bed)(ac) = (dbe)(ca), \text{ etc.}$$

A cycle of degree two is called a transposition. A cycle of higher degree
may be rewritten as a product of two or more transpositions in several
different ways, but then the product will contain the same letter or letters
in two or more parentheses. However, if the original cycle contained an
even number of letters, the product of transpositions will be composed of an
odd number of transpositions and if the original cycle contained an odd
number of letters, the product will have an even number of transpositions.
Since any permutation may be decomposed into a product of cycles, and
each of the latter may be written as a product of transpositions, it follows
that any permutation may be factored into a product of transpositions.
Moreover, the different products corresponding to a given permutation
either all contain an even number or all contain an odd number of
transpositions. This property of a permutation is unique, and permits us to
speak of even and odd permutations, $P_e$ and $P_o$. As examples, we see that

$$P_e = (ac)(ab)(ae)(ad) = (ac)(cb)(be)(ed), \text{ etc.}$$

$$P_o = (ac)(be)(ed) = (ca)(be)(bd), \text{ etc.}$$

The symmetric group of order n! is defined as the group of all
permutations, both even and odd, of n letters. The set of n!/2 even permutations
of n letters forms a subgroup of the symmetric group, of order n!/2; it is
called the alternating group. A simple consideration shows it to be an
invariant subgroup. The odd permutations contained in the symmetric
group do not alone form a group, since the product of two odd permutations
is even. However, the complex of odd permutations is one of the
elements of the quotient group of order two which is isomorphous with
the symmetric group, the other element being the complex of even
permutations.
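Cycle form and parity are simple to compute. The sketch below is illustrative and rests on two assumptions: a permutation is stored as a dictionary from each letter to its replacement, and the replacement row used for $P_2$ is the one implied by its cycle form (ac)(bed). Parity is obtained from the fact that a cycle of degree d factors into d - 1 transpositions.

```python
from itertools import permutations


def cycles(mapping):
    """Decompose a permutation, given as {letter: replacement}, into disjoint cycles."""
    seen, result = set(), []
    for start in sorted(mapping):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = mapping[x]
        result.append("".join(cycle))
    return result


def is_even(mapping):
    """True if the permutation factors into an even number of transpositions."""
    return sum(len(c) - 1 for c in cycles(mapping)) % 2 == 0


P1 = dict(zip("abcde", "cebad"))  # a -> c, b -> e, c -> b, d -> a, e -> d
P2 = dict(zip("abcde", "ceabd"))  # assumed from the cycle form (ac)(bed)

# number of even permutations of four letters, i.e. the order of the alternating group
n_even = sum(is_even(dict(zip("abcd", p))) for p in permutations("abcd"))
```

Here `cycles(P1)` gives the single cycle `acbed` and `cycles(P2)` gives `ac` and `bed`; cycles of degree one are listed explicitly when they occur.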


Problem a. Construct elements and group table for the symmetric group on four 
letters. Decompose all elements into transpositions. 


Suppose a permutation has been factored into α cycles of degree one,
β cycles of degree two, etc. We describe this arrangement by the symbol
$(1^\alpha 2^\beta 3^\gamma \cdots)$ which is called a partition. It is easy to see that any
permutation P and its inverse $P^{-1}$ will belong to the same partition, for
$P^{-1}$ is formed from P by reversing the order of the letters in the cycles
of P. Thus

$$P_2 = (ac)(bed); \qquad P_2^{-1} = (ca)(deb)$$

It is also true that elements in the same class belong to the same partition
and that there are as many classes as partitions (cf. Problem a). Now if
the total number of letters in a permutation is n, we must have

$$\alpha + 2\beta + 3\gamma + \cdots = n$$

hence the number of possible partitions or the number of classes equals the
number of distinct solutions of this equation in positive integers or zero.

In order to find the number of elements in a class we must find the
number of permutations having the same cycle structure. Suppose there
are n letters and that the particular class under consideration belongs to
the partition $(1^\alpha 2^\beta 3^\gamma \cdots)$. There are n! ways of arranging the n letters
but not all of the arrangements will lead to a different permutation. For
instance, we may start a given cycle with any letter in it; i.e., (abc), (bca)
and (cab) are identical. This fact means that $1^\alpha 2^\beta 3^\gamma \cdots$ arrangements
will differ only by cyclic permutation within the various cycles. There is still
another possibility of duplication. It does not matter whether we write
(ab)(cd) or (cd)(ab), hence there are $\alpha!\,\beta!\,\gamma! \cdots$ interchanges of this
kind, each corresponding to the same permutation. We thus conclude that the
number of different arrangements or the number of elements in a class
symbolized by the partition $(1^\alpha 2^\beta 3^\gamma \cdots)$ equals

$$\frac{n!}{1^\alpha \alpha!\; 2^\beta \beta!\; 3^\gamma \gamma! \cdots} \tag{15-18}$$

Application of the methods just described will show that for n = 4,
there are 5 classes corresponding to the partitions $(1^4)$, $(1^2,2)$, $(1,3)$,
$(2^2)$, $(4)$. Typical elements of each class are E = (a)(b)(c)(d); (ab); (abc);
(ab)(cd); (abcd). The number of elements in each class is 1, 6, 8, 3 and 6,
respectively. The complete class of $(2^2)$ is $C_4$ = (ab)(cd); (ac)(bd);
(ad)(bc).
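The counting of classes and of their elements can be reproduced with a short program. This is a minimal sketch; it assumes a partition is written as a non-increasing tuple of cycle degrees, e.g. `(2, 1, 1)` for $(1^2,2)$, and the function names are illustrative.

```python
from collections import Counter
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of cycle degrees."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def class_size(partition):
    """Eq. (18): n! divided by the product of degree**count * count! over cycle degrees."""
    n = sum(partition)
    denominator = 1
    for degree, count in Counter(partition).items():
        denominator *= degree ** count * factorial(count)
    return factorial(n) // denominator


sizes_n4 = {p: class_size(p) for p in partitions(4)}
```

The class sizes must add up to n!, which gives a quick consistency check on any table produced this way.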

Problem b. Verify the statements of the preceding paragraph.

Two irreducible representations of the symmetric group are found
immediately from the quotient group, for if the even and odd classes are
indicated by $C^+$ and $C^-$, we have¹¹

$$\begin{aligned}
\Gamma^{(1)}(C^+) &= \Gamma^{(1)}(C^-) = 1 \\
\Gamma^{(\bar{1})}(C^+) &= 1; \qquad \Gamma^{(\bar{1})}(C^-) = -1
\end{aligned} \tag{15-19}$$

All other irreducible representations are of higher degree. From each one
of these (and also from $\Gamma^{(1)}$ as shown in (20)), a new representation called
the associated representation can be obtained by forming the direct product

$$\Gamma^{(i)} \times \Gamma^{(\bar{1})} = \Gamma^{(\bar{i})} \tag{15-20}$$

Both $\Gamma^{(i)}$ and $\Gamma^{(\bar{i})}$ have the same dimensions and
$\Gamma^{(\bar{\bar{i}})} = \Gamma^{(i)}$. If $\Gamma^{(i)} = \Gamma^{(\bar{i})}$, the two
representations are self-associated. Since $\Gamma^{(\bar{1})} = +1$ for even classes
and −1 for odd classes, it follows that

$$\chi^{(\bar{i})}(C^{\pm}) = \pm\chi^{(i)}(C^{\pm}) \tag{15-21}$$

In order to satisfy (21), the character of $C^-$ for a self-associated
representation must be equal to zero.

Provided n < 5, a simple method may be used to obtain the complete
table of characters for the symmetric group. When n ≥ 5, this procedure
will not give the characters for all the classes but actually it still gives the
characters which are of interest for physical problems.¹² The restriction
on n is not a defect of the theory, since eq. (22) which follows is a
simplified form of the general polynomial which applies for any value of n.

Suppose there are in a given class of the group p cycles of degrees
$\lambda_1, \lambda_2, \cdots, \lambda_p$ with $\lambda_1 + \lambda_2 + \cdots + \lambda_p = n$. Then $\chi^{(k)}$ is
the coefficient of $x^k$, $k \le n/2$, in the polynomial

$$(1 - x)(1 + x^{\lambda_1})(1 + x^{\lambda_2}) \cdots (1 + x^{\lambda_p}) = \sum_k \chi^{(k)} x^k \tag{15-22}$$

The coefficients of the highest power of $x^k$, that is, k = 1 for n = 3 and
k = 2 for n = 4, are the characters of the self-associated representation.


¹¹ We originally denoted $\Gamma^{(\bar{1})}$ by the symbol $\Gamma^{(2)}$. It is convenient
here to use a different notation in order to show the relation between
$\Gamma^{(1)}$ and $\Gamma^{(\bar{1})}$.

¹² For proof of this statement and a derivation of the method with n < 5, see Wigner,
E., "Gruppentheorie und ihre Anwendung auf Quantenmechanik der Atomspektren,"
Braunschweig, 1931, Chapter XIII.

We thus obtain from (22) the characters of k representations of the group
while those of the remaining (s − k) representations are the associated ones
which may be obtained by using (21). We illustrate the procedure for the
symmetric group of order 4!.

For the partition $(1^4)$, $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 1$ and the polynomial is
$(1 - x)(1 + x)^4$. Since $k \le n/2$, we take the coefficients of $x^0$, $x$ and $x^2$
which are 1, 3 and 2. The last value, 2, is the character of a self-associated
representation as previously pointed out. The class under consideration
is even, hence the associated characters are 3 and 1, completing the first
column of the character table. For the next class $(1^2,2)$, we have
$\lambda_1 = \lambda_2 = 1$, $\lambda_3 = 2$; the polynomial is $(1 - x)(1 + x)^2(1 + x^2)$ and the
coefficients of $x^0$, $x$ and $x^2$ are 1, 1, 0. The class is odd and the associated
characters are −1, −1. The remaining polynomials are $(1 - x)(1 + x)(1 + x^3)$;
$(1 - x)(1 + x^2)^2$ and $(1 - x)(1 + x^4)$. All of the characters
are given in Table 4. We have added the number of elements in each class
and indicated by signs the even and odd classes.


TABLE 4

Class                     $(1^4)^+$   $(1^2,2)^-$   $(1,3)^+$   $(2^2)^+$   $(4)^-$
No. of Elements               1           6            8           3          6
$\Gamma^{(1)}$                1           1            1           1          1
$\Gamma^{(2)}$                3           1            0          -1         -1
$\Gamma^{(3)}$                2           0           -1           2          0
$\Gamma^{(\bar{2})}$          3          -1            0          -1          1
$\Gamma^{(\bar{1})}$          1          -1            1           1         -1
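The polynomial method lends itself to a short calculation. The sketch below is an illustration restricted to n = 3 and n = 4, the cases in which the coefficient of the highest power belongs to the self-associated representation; polynomials are stored as coefficient lists, and the function names are illustrative.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def characters_from_polynomial(cycle_degrees):
    """Coefficients of x**k, k <= n/2, in (1 - x) * prod(1 + x**lam), eq. (22)."""
    n = sum(cycle_degrees)
    poly = [1, -1]
    for lam in cycle_degrees:
        poly = poly_mul(poly, [1] + [0] * (lam - 1) + [1])
    return poly[: n // 2 + 1]


def is_even_class(cycle_degrees):
    return sum(lam - 1 for lam in cycle_degrees) % 2 == 0


def column(cycle_degrees):
    """One column of the character table: eq. (22) followed by the associated characters."""
    if sum(cycle_degrees) not in (3, 4):
        raise ValueError("this sketch only covers n = 3 and n = 4")
    chars = characters_from_polynomial(cycle_degrees)
    sign = 1 if is_even_class(cycle_degrees) else -1
    # the last coefficient is self-associated; the others acquire the sign of eq. (21)
    associated = [sign * c for c in reversed(chars[:-1])]
    return chars + associated


classes = [(1, 1, 1, 1), (1, 1, 2), (1, 3), (2, 2), (4,)]
rows = [list(r) for r in zip(*(column(c) for c in classes))]
```

Each entry of `rows` is one line of the character table for n = 4, in the order $\Gamma^{(1)}$, $\Gamma^{(2)}$, $\Gamma^{(3)}$ and then the two associated representations.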


15.13. The Alternating Group.—If two elements $A_i$ and $A_j$ of the
symmetric group are in the same class, it does not follow that they will
belong to the same class of the alternating group. Any even class of the
symmetric group which contains none or one cycle of each odd order and no
cycles of even order will split into two classes in the alternating group,
each of the new classes containing half as many elements as it contained in
the symmetric group. For example A and B of (5) belong to the same class of
the symmetric group with n = 3, but to different classes of the alternating
group, as may be verified from (2).

The characters of the symmetric group which are not self-associated are
also characters of the alternating group. Every character of a
self-associated representation is the sum of two equal characters for the
alternating group except for the two classes which have been obtained by
splitting a class of the symmetric group. Thus if n = 3 or 4 and the
character table is known for the symmetric group, we can fill the character
table for the alternating group except for four blank spaces. Suppose the two
classes whose entry in the table is blank are obtained from the partition
$(\lambda_1, \lambda_2, \lambda_3, \cdots)$. Then if $\nu = \lambda_1\lambda_2\lambda_3 \cdots$, the character
$(-1)^{(\nu-1)/2}$ will occur in the symmetric group at the intersection of the
row corresponding to the self-associated representation and the column of the
class in question while in the alternating group we will have

$$\frac{(-1)^{(\nu-1)/2} \pm \sqrt{(-1)^{(\nu-1)/2}\, \nu}}{2} \tag{15-23}$$

The two remaining vacant places in the table are filled by interchanging
the two characters given by (23).

For n = 4, there are 4 classes since $(1,3)^+$ splits into $(1,3)'$ and $(1,3)''$.
The self-associated representation is $\Gamma^{(3)}$. Its characters become
(1, x, y, 1) and (1, y, x, 1) where x and y obtained from (23) are
$(-1 \pm i\sqrt{3})/2$ since $\lambda_1 = 1$, $\lambda_2 = 3$, $\nu = 3$. Writing
$\epsilon = \exp(2\pi i/3)$, we thus have $x = \epsilon$, $y = \epsilon^2$. This completes the
calculation as shown in Table 5.


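TABLE 5

Class                 $(1^4)$    $(1,3)'$        $(1,3)''$       $(2^2)$
No. of Elements          1          4               4               3
$\Gamma^{(1)}$           1          1               1               1
$\Gamma^{(2)}$           3          0               0              -1
$\Gamma^{(3)'}$          1       $\epsilon$      $\epsilon^2$       1
$\Gamma^{(3)''}$         1       $\epsilon^2$    $\epsilon$         1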
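The entries for the split classes can be evaluated from eq. (23) and the finished table tested against the orthogonality relation (13). This is a small sketch under stated assumptions: the table for n = 4 is typed in with the class sizes 1, 4, 4, 3, and the function names are illustrative.

```python
import cmath
from math import prod


def split_class_characters(cycle_degrees):
    """The two values of eq. (23) for a class that splits in the alternating group."""
    nu = prod(cycle_degrees)
    sign = (-1) ** ((nu - 1) // 2)
    root = cmath.sqrt(sign * nu)
    return (sign + root) / 2, (sign - root) / 2


x, y = split_class_characters((1, 3))
eps = cmath.exp(2j * cmath.pi / 3)

class_sizes = [1, 4, 4, 3]
table5 = [
    [1, 1, 1, 1],
    [3, 0, 0, -1],
    [1, x, y, 1],
    [1, y, x, 1],
]


def orthogonality_defect(table, sizes):
    """Largest deviation from eq. (13) over all pairs of rows."""
    g = sum(sizes)
    worst = 0.0
    for i, chi_i in enumerate(table):
        for j, chi_j in enumerate(table):
            inner = sum(
                r * complex(a) * complex(b).conjugate()
                for r, a, b in zip(sizes, chi_i, chi_j)
            )
            worst = max(worst, abs(inner - (g if i == j else 0)))
    return worst
```

The sum `x + y` reproduces the character −1 that the self-associated representation has for the class $(1,3)$ in the symmetric group, in line with the statement that its character is the sum of two characters of the alternating group.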


15.14. The Unitary Group.—The collection of all non-singular matrices
of order n, with matrix multiplication as the law of combination, is the
representation of a group called the full linear group (FLG). The order
of the group is infinite, for its elements are the infinite number of linear
transformations that change a vector x into a new vector. This group has
many subgroups obtained by imposing certain restrictions on the matrices
of its transformations. Thus, we might exclude all matrices except those
with determinant equal to +1 or we might require that the matrices be
orthogonal. Such groups are discrete, if the elements are infinitely
denumerable (an example of a discrete group of this type is given in sec. 15.1);
continuous, if the elements are non-denumerable. An example of the latter is
the group of rotations about an axis. One may also have mixed-continuous
groups such as $R^{\pm}(2)$ discussed in sec. 15.16. Infinite groups have many
of the properties of finite groups, although naturally some modifications¹³
in their treatment are necessary.


¹³ See, for example, Wigner, loc. cit., Chapter X.

We first consider a subgroup of FLG, which is called the two-dimensional
unimodular unitary group (SUG, special unitary group). Its elements are
square unitary matrices of order two with determinant of +1. Let us
take a matrix

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

and modify it so that these conditions are met. Referring to eq. (10-50),
we see that we must have $c = -b^*$ and $d = a^*$. Thus a typical element of
SUG is

$$U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}; \qquad |U| = aa^* + bb^* = 1 \tag{15-24}$$

When this matrix is applied to a column vector $x = \{x_1, x_2\}$ so that
$Ux = x'$, we have

$$\begin{aligned}
x_1' &= ax_1 + bx_2 \\
x_2' &= -b^*x_1 + a^*x_2
\end{aligned} \tag{15-25}$$

It will also transform any function of x into a function of linear
combinations of $x_1$, $x_2$; for example,

$$Uf(x) = f(x') = f(ax_1 + bx_2,\ -b^*x_1 + a^*x_2) \tag{15-26}$$

Thus if U operates on a set of (n + 1) homogeneous products

$$f_p = x_1^p x_2^{n-p} \qquad (p = 0, 1, 2, \cdots, n) \tag{15-27}$$

the result is a homogeneous polynomial of the same degree

$$Uf_p = (ax_1 + bx_2)^p (-b^*x_1 + a^*x_2)^{n-p} = \sum_{p'=0}^{n} U_{pp'}^{(n)}\, x_1^{p'} x_2^{n-p'} \tag{15-28}$$

Clearly, the two-dimensional matrices U are themselves representations
of SUG. But the matrices with elements $U_{pp'}^{(n)}$, being isomorphous with U
because of eq. (28), must also be representations, provided we can show
that they are unitary. As a matter of fact, they are not unitary, but if
each element is multiplied by $[p!\,(n-p)!]^{-1/2}$, they become so.
Multiplication of the elements by this constant factor is, of course,
equivalent to multiplying $f_p$ by the same quantity. When we do this, we find
it convenient to set n = 2j; p = j + m. The purpose of the latter substitution
is to enable us to prove in sec. 15.15 that SUG is isomorphous with the
three-dimensional rotation group.
When these changes have been made, $f_p$ becomes

$$f_m^{(j)} = \frac{x_1^{j+m}\, x_2^{j-m}}{\sqrt{(j+m)!\,(j-m)!}} \tag{15-29}$$

where $j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \cdots$; $m = -j, -j+1, \cdots, j-1, j$. At the same
time, eq. (28) becomes

$$Uf_m^{(j)} = \frac{(ax_1 + bx_2)^{j+m}(-b^*x_1 + a^*x_2)^{j-m}}{\sqrt{(j+m)!\,(j-m)!}} = \sum_{q=-j}^{j} U_{mq}^{(j)} f_q^{(j)} \tag{15-30}$$

The resulting matrices whose elements are $U_{mq}^{(j)}$ will be indicated by $U^{(j)}$.
They are unitary and irreducible; furthermore, there are no other irreducible
representations¹⁴ of SUG.

In order to obtain the elements of $U^{(j)}$, we develop (30) by the binomial
theorem and pick out the coefficient of $f_q^{(j)}$. It is found to be

$$U_{mq}^{(j)} = \sum_t (-1)^t\, \frac{\sqrt{(j+m)!\,(j-m)!\,(j+q)!\,(j-q)!}}{(j-m-t)!\,(j+q-t)!\,t!\,(t-q+m)!}\; a^{j+q-t}\, (a^*)^{j-m-t}\, b^{t-q+m}\, (b^*)^{t} \tag{15-31}$$

In this expression, t takes the values 0, 1, 2, ... and the summation breaks
off automatically when negative powers of the a's and b's appear because
the denominator will then contain the factorial of a negative integer, which
is infinite.

Since m and q have (2j + 1) possible values, it follows that the matrices
of the representations have dimensions of (2j + 1). If j = 0, $U^{(0)} = 1$.
If $j = \tfrac{1}{2}$, m and q can take the values $\pm\tfrac{1}{2}$, hence if the elements
of the matrix are characterized by $+\tfrac{1}{2}$ and $-\tfrac{1}{2}$, in that order, we
have $U^{(1/2)}$ identical with U of (24).
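The matrix elements obtained by the binomial expansion can be evaluated and tested numerically. The code below is a sketch, not a reference implementation: it assumes the sum over t carries the factor $(-1)^t$ that arises from expanding $(-b^*x_1 + a^*x_2)^{j-m}$, it labels rows and columns by m, q = j, j − 1, ..., −j, it takes `two_j` = 2j as an integer so that half-integral j can be handled, and the three-angle parameterization in `su2` is just one convenient way of producing a and b with $aa^* + bb^* = 1$.

```python
import cmath
from math import factorial, sqrt


def u_matrix(two_j, a, b):
    """Matrix U^(j), j = two_j / 2, from the binomial expansion of eq. (30)."""
    a, b = complex(a), complex(b)
    steps = range(two_j, -1, -1)          # values of j + m (rows) and j + q (columns)
    mat = []
    for jpm in steps:                     # jpm = j + m
        jmm = two_j - jpm                 # j - m
        row = []
        for jpq in steps:                 # jpq = j + q
            jmq = two_j - jpq             # j - q
            norm = sqrt(factorial(jpm) * factorial(jmm) * factorial(jpq) * factorial(jmq))
            total = 0j
            for t in range(two_j + 1):
                pow_a, pow_ac, pow_b = jpq - t, jmm - t, t + jpm - jpq
                if min(pow_a, pow_ac, pow_b) < 0:
                    continue              # term with a negative factorial drops out
                denom = factorial(pow_ac) * factorial(pow_a) * factorial(t) * factorial(pow_b)
                total += ((-1) ** t * norm / denom
                          * a ** pow_a * a.conjugate() ** pow_ac
                          * b ** pow_b * b.conjugate() ** t)
            row.append(total)
        mat.append(row)
    return mat


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def dagger(x):
    return [[complex(x[j][i]).conjugate() for j in range(len(x))] for i in range(len(x[0]))]


def close(x, y, tol=1e-12):
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))


def su2(alpha, beta, gamma):
    """Illustrative choice of (a, b) with a a* + b b* = 1."""
    a = cmath.exp(1j * (alpha + gamma) / 2) * cmath.cos(beta / 2)
    b = 1j * cmath.exp(1j * (gamma - alpha) / 2) * cmath.sin(beta / 2)
    return a, b
```

With two elements V = (a1, b1) and U = (a2, b2), the product VU has first row `a1*a2 - b1*conj(b2)`, `a1*b2 + b1*conj(a2)`; comparing `u_matrix` of the product with the product of the two `u_matrix` results tests the representation property, and `matmul(D, dagger(D))` tests unitarity.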

In order to determine the characters of SUG let us select a typical
matrix of the group and transform it to diagonal form. A unitary
transformation is required and it is certain that among the infinite number of
unitary matrices in the group, one may be found, say V, that will effect the
diagonalization

$$V^{-1}UV = U_1 = \begin{pmatrix} a' & 0 \\ 0 & a'^* \end{pmatrix} \tag{15-32}$$

Finally, since we require $|U_1| = 1$, the coefficients of $U_1$ may be
determined¹⁵:

$$U_1 = \begin{pmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{pmatrix} \tag{15-33}$$

¹⁴ The proof of these facts will be found in Murnaghan, loc. cit., Chapter 3 or
Wigner, loc. cit., Chapter XV.

¹⁵ The reason for choosing $e^{i\alpha/2}$ instead of $e^{i\alpha}$ will become apparent in
sec. 15.15.

All other matrices of the group belong to the same class as U and $U_1$,
for the class is composed of elements which are obtained from each other by
similarity transformations or, in this particular case, by unitary
transformations. Since each matrix is unitary it remains unitary when it
undergoes such a transformation. We also know that it is only necessary to
calculate the character of one element from a class; thus using $U_1$,

$$\chi^{(1/2)} = e^{i\alpha/2} + e^{-i\alpha/2}$$

is the character for the representation of degree (2j + 1) = 2.

Now the matrices $U^{(j)}$, which are identical with U when $j = \tfrac{1}{2}$, must
be transformable in such a way that the characters will be identical for
$j = \tfrac{1}{2}$, and when this is done the characters should apply to SUG for any
value of j. If we substitute $a = e^{i\alpha/2}$, b = 0 in (31), the result is of
diagonal form since all elements disappear unless t = 0 and m = q,

$$U_{mm}^{(j)} = e^{im\alpha} \tag{15-34}$$

The required characters for SUG, infinite in number, are thus

$$\chi^{(j)}(\alpha) = \sum_{m=-j}^{j} e^{im\alpha} \tag{15-35}$$

A simpler form of the last expression may be obtained as follows. Let
$\rho = e^{i\alpha}$ so that

$$\chi^{(j)} = \rho^{-j}(1 + \rho + \rho^2 + \cdots + \rho^{2j}) = \frac{\rho^{j+1} - \rho^{-j}}{\rho - 1}$$

Multiply numerator and denominator by $e^{-i\alpha/2}$ and use the relation
$\sin x = i(e^{-ix} - e^{ix})/2$; then

$$\chi^{(j)} = \frac{\sin\,(2j+1)\alpha/2}{\sin \alpha/2} \tag{15-36}$$
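The equality of the sum (35) and the closed form (36) is quickly confirmed numerically. In this sketch `two_j` stands for the integer 2j, so that half-integral j are covered, and the angle is assumed not to be a multiple of 2π, where the closed form becomes 0/0.

```python
import cmath
import math


def character_sum(two_j, alpha):
    """Eq. (35): sum of exp(i m alpha) for m = -j, -j+1, ..., j with j = two_j / 2."""
    return sum(
        cmath.exp(1j * (two_m / 2) * alpha)
        for two_m in range(-two_j, two_j + 1, 2)
    )


def character_closed(two_j, alpha):
    """Eq. (36): sin((2j+1) alpha / 2) / sin(alpha / 2)."""
    return math.sin((two_j + 1) * alpha / 2) / math.sin(alpha / 2)
```

The imaginary parts in `character_sum` cancel in pairs, so the result is real up to rounding.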


The irreducible representations and characters satisfy certain
orthogonality and normalization conditions¹⁶ as in the case of finite groups,
but the summations in (9) and (13) are replaced by integrals.

15.15. The Three-Dimensional Rotation Groups.—Another important
subgroup of FLG (as well as of the n-dimensional unitary group) is the
n-dimensional full, real orthogonal group which consists of all unitary
matrices with real elements. If we further restrict this subgroup, choosing
all real unitary matrices with determinant equal to +1, we have the
n-dimensional proper, real orthogonal group or the rotation group. It should
be remembered that an orthogonal matrix need not be unitary but a real
orthogonal matrix and a real unitary matrix are synonymous terms. For
the moment, we consider the three-dimensional rotation group $R^+(3)$ whose
elements are real orthogonal matrices of order three with determinant +1.


¹⁶ See Wigner, loc. cit., Chapter XV or Eckart, Carl, Rev. Mod. Phys. 2, 344 (1930).

Assume that we have a sphere of unit radius, the center of which
coincides with the origin of a coordinate system OXYZ fixed in space. Now let
the coordinates of some point on the surface of the sphere be (x,y,z) and
rotate the sphere in any manner whatsoever leaving its center fixed. The
new coordinates of the point (x',y',z') will be related to (x,y,z) by some
matrix $R(\alpha,\beta,\gamma)$ which is an element of $R^+(3)$. As we have shown in sec.
9.5, such a rotation may be factored into a product of three plane rotations
described by the Eulerian angles $(\alpha,\beta,\gamma)$; i.e., we may write

$$R(\alpha,\beta,\gamma) = R_z(\gamma)R_x(\beta)R_z(\alpha) \tag{15-37}$$

where $R_z$ and $R_x$ are rotations about the Z- and X-axes respectively.

In order to find the representations of $R^+(3)$ we could use a method
similar to that of sec. 15.14 and study the effect of transforming a function
of (x,y,z) by the elements of the group. A simpler method¹⁷ is available
for we will show that $R^+(3)$ is isomorphous with SUG. Since we know the
representations of the latter, we may use the same results for $R^+(3)$. We
recall, however, that the elements of SUG are two-dimensional matrices
while the elements of $R^+(3)$ are three-dimensional, hence the proof of the
isomorphism depends upon finding some relation between these two kinds
of matrices. The problem is an old one which occurred in classical
mechanics; it was solved by Klein and by Cayley, who made use of a special
kind of transformation in the complex plane.¹⁸ We prefer to proceed in
another way.

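We first observe that any two-dimensional matrix may be written as a
linear combination of the four matrices¹⁹

$$P_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad
P_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \quad
P_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \quad
P_4 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{15-38}$$

For example, if

$$H = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}$$

we may write

$$H = c_1P_1 + c_2P_2 + c_3P_3 + c_4P_4$$

where

$$c_1 = (H_{12} + H_{21})/2; \qquad c_2 = i(H_{12} - H_{21})/2$$

$$c_3 = (H_{11} - H_{22})/2; \qquad c_4 = (H_{11} + H_{22})/2$$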
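The decomposition can be verified for an arbitrary matrix. The following sketch assumes 2 × 2 matrices stored as nested tuples and uses the four matrices of eq. (38); the function names and the sample matrix are illustrative.

```python
P1 = ((0, 1), (1, 0))
P2 = ((0, -1j), (1j, 0))
P3 = ((1, 0), (0, -1))
P4 = ((1, 0), (0, 1))


def pauli_coefficients(H):
    """Coefficients c_1, ..., c_4 of H in the basis P_1, ..., P_4."""
    (h11, h12), (h21, h22) = H
    return (
        (h12 + h21) / 2,
        1j * (h12 - h21) / 2,
        (h11 - h22) / 2,
        (h11 + h22) / 2,
    )


def recombine(c):
    """Form c_1 P_1 + c_2 P_2 + c_3 P_3 + c_4 P_4."""
    basis = (P1, P2, P3, P4)
    return tuple(
        tuple(sum(ck * P[r][s] for ck, P in zip(c, basis)) for s in range(2))
        for r in range(2)
    )


H = ((1 + 2j, 3), (4j, -5))          # arbitrary sample matrix
coefficients = pauli_coefficients(H)
rebuilt = recombine(coefficients)
```

For a Hermitian H the four coefficients come out real, and for a traceless H the coefficient of $P_4$ vanishes.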


¹⁷ Both methods are discussed by Wigner, loc. cit., Chapter XV.

¹⁸ The details are given by Whittaker, E. T., "Analytical Dynamics," Third
Edition, Cambridge University Press, 1927, p. 12. The quantities a and b which
appear in our eq. (24) are identical with the Cayley-Klein parameters. Eckart,
loc. cit., and Bauer, loc. cit., have used a similar method in the group theory
problem.

¹⁹ The first three of these are the Pauli spin matrices, discussed in sec. 11.29.

Let us take $c_1 = x$, $c_2 = y$, $c_3 = z$, $c_4 = 0$. Then we have

$$H(x,y,z) = xP_1 + yP_2 + zP_3 = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix} \tag{15-39}$$

Clearly if x, y, z are real, H is Hermitian. Moreover, its trace is zero;
in fact, any two-dimensional matrix with trace of zero may be put into this
form, $P_4$ not being needed. If H is now subjected to a unitary
transformation by the matrix U of (24) its trace is unchanged and we obtain

$$H'(x',y',z') = U^{\dagger}HU = x'P_1 + y'P_2 + z'P_3 \tag{15-40}$$

If we can prove that the relation between x, y, z and x', y', z' is a rotation,
we may conclude that the matrices U of SUG perform the same
transformations as the matrices of the group $R^+(3)$ and that the two groups are
isomorphous. To do this, we note (see the problem in sec. 10.14) that
$|H| = |H'|$, hence

$$x^2 + y^2 + z^2 = x'^2 + y'^2 + z'^2 \tag{15-41}$$

which means that the length of a vector is unchanged by the transformation
of eq. (40) and the latter must be a rotation.

Let us study some special forms of the matrix U whose general form is
given by eq. (24). We first put $a = e^{i\alpha/2}$, b = 0, that is, we use $U_1$, the
diagonal matrix of eq. (33). We easily find

$$\begin{aligned}
U_1^{\dagger}P_1U_1 &= \cos\alpha\, P_1 + \sin\alpha\, P_2 \\
U_1^{\dagger}P_2U_1 &= -\sin\alpha\, P_1 + \cos\alpha\, P_2 \\
U_1^{\dagger}P_3U_1 &= P_3
\end{aligned} \tag{15-42}$$

With these results, (40) becomes

$$\begin{aligned}
x' &= x\cos\alpha + y\sin\alpha \\
y' &= -x\sin\alpha + y\cos\alpha \\
z' &= z
\end{aligned}$$

This clearly represents a rotation through an angle α about Z; it may be
suitably represented by

$$r' = R_z(\alpha)\, r$$

where r' and r are the vectors having components (x',y',z') and (x,y,z),
respectively and

$$R_z(\alpha) = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{15-43}$$

We have thus identified the element of SUG which corresponds to the last
factor on the right of (37); it is $U_1$. Obviously, $R_z(\gamma)$ corresponds to a
matrix like (43) but with α replaced by γ.

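In order to find $R_x(\beta)$ we take

$$U_2 = \begin{pmatrix} \cos\beta/2 & i\sin\beta/2 \\ i\sin\beta/2 & \cos\beta/2 \end{pmatrix} \tag{15-44}$$

It is obtained by putting $a = \cos\beta/2$, $b = i\sin\beta/2$ in (24). We now find

$$\begin{aligned}
U_2^{\dagger}P_1U_2 &= P_1 \\
U_2^{\dagger}P_2U_2 &= \cos\beta\, P_2 + \sin\beta\, P_3 \\
U_2^{\dagger}P_3U_2 &= -\sin\beta\, P_2 + \cos\beta\, P_3
\end{aligned}$$

and (40) may be written $r' = R_x(\beta)\, r$, where

$$R_x(\beta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & \sin\beta \\ 0 & -\sin\beta & \cos\beta \end{pmatrix} \tag{15-45}$$

Our notation, $R_x(\beta)$, is meant to exhibit the fact that (45) represents a
rotation through β about X. Thus we have shown that by a proper choice of
the elements of U, SUG and $R^+(3)$ are isomorphous since
$U = U_1(\gamma)U_2(\beta)U_1(\alpha)$ corresponds to $R(\alpha,\beta,\gamma)$.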
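The correspondence between the two-dimensional and the three-dimensional matrices can be explored numerically. The sketch below makes one assumption explicit: the 3 × 3 matrix is read off as the array of coefficients of $P_1, P_2, P_3$ in $U^{\dagger}P_lU$, so that row l holds the coefficients for $P_l$; these are extracted with the trace, since the trace of $P_kP_l$ is 2 for k = l and 0 otherwise. The function names are illustrative.

```python
import cmath
from math import cos, sin

P = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]


def su2_matrix(a, b):
    a, b = complex(a), complex(b)
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def rotation_from_su2(u):
    """rot[l][k] = coefficient of P_k in U^dagger P_l U = (1/2) tr(P_k U^dagger P_l U)."""
    rot = []
    for p_l in P:
        transformed = mul(mul(dagger(u), p_l), u)
        row = []
        for p_k in P:
            prod = mul(p_k, transformed)
            row.append(((prod[0][0] + prod[1][1]) / 2).real)
        rot.append(row)
    return rot


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def close(x, y, tol=1e-12):
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))


alpha, beta = 0.7, 1.1
U1 = su2_matrix(cmath.exp(1j * alpha / 2), 0)
U2 = su2_matrix(cos(beta / 2), 1j * sin(beta / 2))
R1, R2 = rotation_from_su2(U1), rotation_from_su2(U2)
```

`R1` and `R2` reproduce the arrays (43) and (45). Because U enters twice, U and −U give the same 3 × 3 matrix, which can be confirmed by calling `rotation_from_su2` on both.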

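Let us write

$$\begin{aligned}
U(\alpha,\beta,\gamma) &= U_1(\gamma)\,U_2(\beta)\,U_1(\alpha) \\
&= \begin{pmatrix} e^{i\gamma/2} & 0 \\ 0 & e^{-i\gamma/2} \end{pmatrix}
\begin{pmatrix} \cos\beta/2 & i\sin\beta/2 \\ i\sin\beta/2 & \cos\beta/2 \end{pmatrix}
\begin{pmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{pmatrix} \\
&= \begin{pmatrix} e^{i(\alpha+\gamma)/2}\cos\beta/2 & ie^{i(\gamma-\alpha)/2}\sin\beta/2 \\
ie^{i(\alpha-\gamma)/2}\sin\beta/2 & e^{-i(\alpha+\gamma)/2}\cos\beta/2 \end{pmatrix}
\end{aligned} \tag{15-46}$$

On comparing this with (24), we see that we have
$a = e^{i(\alpha+\gamma)/2}\cos\beta/2$ and $b = ie^{i(\gamma-\alpha)/2}\sin\beta/2$, so that (31)
becomes

$$U_{mq}^{(j)}(\alpha,\beta,\gamma) = \sum_t (-1)^t\, \frac{\sqrt{(j+m)!\,(j-m)!\,(j+q)!\,(j-q)!}}{(j-m-t)!\,(j+q-t)!\,t!\,(t-q+m)!}\; i^{\,m-q} \cos^{2j+q-m-2t}\beta/2\; \sin^{2t+m-q}\beta/2\; e^{i(q\alpha + m\gamma)} \tag{15-47}$$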


As before, $j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \cdots$. For j = 0, we get
$U^{(0)}(\alpha,\beta,\gamma) = 1$; for $j = \tfrac{1}{2}$, we obtain (46). It may be shown²⁰
that the matrices whose elements are given by (47) are irreducible
representations and that there are no further ones. The characters of the
representations are found from (35). Remembering that
$e^{\pm ix} = \cos x \pm i\sin x$, they may be written as

$$\begin{aligned}
\chi^{(j)}(\alpha) &= 1 + 2\cos\alpha + \cdots + 2\cos j\alpha; && \text{if } j = 0, 1, 2, \cdots \\
\chi^{(j)}(\alpha) &= 2\cos\alpha/2 + 2\cos 3\alpha/2 + \cdots + 2\cos j\alpha; && \text{if } j = \tfrac{1}{2}, \tfrac{3}{2}, \cdots
\end{aligned} \tag{15-48}$$

Although $R^+(3)$ is isomorphous with SUG, the isomorphism is not
simple. If $0 \le \alpha \le 4\pi$, $0 \le \beta \le \pi$, $0 \le \gamma \le 2\pi$, then as α, β and γ
take all values between these limits, a and b of (24) will take all pairs of
values satisfying the requirement $aa^* + bb^* = 1$ once only. On the other hand,
if α, β and γ are Eulerian angles their limits are $0 \le \alpha \le 2\pi$,
$0 \le \beta \le \pi$, $0 \le \gamma \le 2\pi$. But the angles occur in (46) divided by 2, hence
the trigonometric functions are undetermined with regard to sign. In other
words, every matrix $R(\alpha,\beta,\gamma)$ is isomorphous with two matrices
$U(\alpha,\beta,\gamma)$. We must thus discard half of the representations of SUG in
order to find the ones appropriate to $R^+(3)$. It is easy to see which ones we
want. From (47) it follows that

$$U_{mq}^{(j)}(\alpha + 2\pi, \beta, \gamma) = e^{2\pi i q}\, U_{mq}^{(j)}(\alpha,\beta,\gamma)$$

Now when j is integral, q is also integral, for $-j \le q \le j$ and then
$e^{2\pi i q} = 1$. If j were half integral, the identical rotations α and α + 2π
would have representations differing in sign. However,
$R(\alpha,\beta,\gamma)U(\alpha,\beta,\gamma)$ for both integral and half-integral j values is a
group which is isomorphous with $U(\alpha,\beta,\gamma)$ and all matrices
$U^{(j)}(\alpha,\beta,\gamma)$ are representations. This group is of importance in the
Pauli spin theory.²¹

²⁰ See footnote 17.

If we take as elements of an infinite group, all real unitary matrices of
order three with determinant equal to +1 as well as −1, we have the
three-dimensional full real orthogonal group $R^{\pm}(3)$. The quotient group
isomorphous with it has two elements. The unit element, which is also an
invariant subgroup of $R^{\pm}(3)$, contains E and all proper rotations R such as
(43) or (45) with |R| = +1. The other element of the quotient group
is an infinite number of improper rotations T with |T| = −1, a typical
one (cf. sec. 10.17) being

$$T_z(\phi) = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{15-49}$$

The simplest member of the class of T is an improper rotation by the
angle π, the operation called inversion

$$T_z(\pi) = I = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{15-50}$$


²¹ Cf. Chapter 11.

It is always possible to find some improper rotation T which will convert
any other improper rotation T' into an inversion, $T^{-1}T'T = I$, just as it is
always possible to find an inverse to a proper rotation, $R^{-1}R'R = E$. The
group $R^{\pm}(3)$ may thus be considered as the direct product of $R^+(3)$ and the
group I, the latter having elements E and I. It will have two irreducible
representations for every value of j, each being of dimension (2j + 1).
The element R has two representations both equal to $U^{(j)}$ while T has
representations $\pm U^{(j)}$.

15.16. The Two-Dimensional Rotation Groups.—The two-dimensional 
pure rotation group R* (2) is a subgroup of R*(3). Its elements are the 
proper rotations in a plane perpendicular to a fixed axis. Let R(@) be one 
of the elements where 0 < ¢ < 2r, then if x is a vector with components x; 
and zz, the element A(¢) may be represented by the matrix CC) 


x’ = C(¢)x; OLA Ln (15-51) 
with 
E coso sing| — 7 
C(¢) = | & b cos °| (15-52) 


If R(@’) is another element of the group, which is represented by C(¢’), then 
C (g)C(¢’) = Ce + 6’) = C')C@) (15-53) 


and the group is Abelian. Referring to sec. 15.11, we see that for such
groups, each element is in a class by itself and the irreducible
representations are one-dimensional. Thus (52) is reducible, a unitary matrix of
eigenvectors of $C$ being required for that purpose since $C$ itself is an
orthogonal matrix. The normalized eigenvectors of $C$ are found to be

$$\mathbf{u}_1 = \frac{1}{\sqrt{2}}\{1, i\}; \qquad \mathbf{u}_2 = \frac{1}{\sqrt{2}}\{1, -i\} \tag{15-54}$$

and the eigenvalues are $e^{\pm i\phi}$. These, then, are characters of an irreducible
representation. However, there are an infinite number of classes, so there
must be an infinite number of representations. The corresponding
characters may be taken as

$$\chi^{(m)} = e^{im\phi}; \qquad m = 0, \pm 1, \pm 2, \cdots \tag{15-55}$$


for each will satisfy the multiplication requirement of the group, as indi- 
cated by eq. (53). 
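
The statements about eqs. (54) and (55) are easily checked numerically. The
following is a minimal Python sketch, not part of the original text; the helper
names `C`, `matvec` and `chi` and the test angles are our own assumptions. It
verifies that $\{1, i\}/\sqrt{2}$ is an eigenvector of $C(\phi)$ with eigenvalue $e^{i\phi}$ and
that the characters $e^{im\phi}$ multiply in the way eq. (53) requires.

```python
import cmath
import math

def C(phi):
    """Matrix of eq. (52) for a proper rotation by phi."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]

def matvec(a, v):
    """Product of a 2x2 matrix with a 2-component vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1]]

def chi(m, phi):
    """Character of eq. (55) for the representation labelled m."""
    return cmath.exp(1j * m * phi)

phi, phi2 = 0.7, 1.9                          # arbitrary test angles
u1 = [1 / math.sqrt(2), 1j / math.sqrt(2)]    # first eigenvector of eq. (54)

# C(phi) u1 should equal exp(i phi) u1
lhs = matvec(C(phi), u1)
rhs = [cmath.exp(1j * phi) * x for x in u1]
eig_ok = all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))

# chi(phi) chi(phi') = chi(phi + phi') for several values of m
mult_ok = all(abs(chi(m, phi) * chi(m, phi2) - chi(m, phi + phi2)) < 1e-12
              for m in range(-3, 4))
print(eig_ok, mult_ok)
```
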

The two-dimensional rotary reflection group $R^{\pm}(2)$ is composed of both
proper and improper rotations. A typical element of it is represented by
the matrix

$$A(\phi,d) = \begin{pmatrix} \cos\phi & \sin\phi \\ -d\sin\phi & d\cos\phi \end{pmatrix} \tag{15-56}$$

where $d$ equals the determinant of $A(\phi,d)$ and may be either $+1$ or $-1$.
If $d = +1$, we have a proper rotation with matrix $C(\phi)$; if $d = -1$, an
improper rotation, the matrix of which will be indicated by $S(\phi)$

$$S(\phi) = \begin{pmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{pmatrix} \tag{15-57}$$

Clearly,

$$S^2(\phi) = S(\phi)S(\phi) = E \tag{15-58}$$

but the group is not Abelian, for $C(\phi)$ and $S(\phi)$ do not commute. In fact,

$$A(\phi,d)A(\phi',d') = A(d'\phi + \phi', dd') \tag{15-59}$$
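
The multiplication law (59) can be confirmed for all four sign combinations
with a few lines of Python. This is only an illustrative sketch; the function
names `A`, `matmul` and `close` and the chosen angles are assumptions of ours.

```python
import math

def A(phi, d):
    """Matrix of eq. (56); d = +1 (proper) or -1 (improper)."""
    return [[math.cos(phi), math.sin(phi)],
            [-d * math.sin(phi), d * math.cos(phi)]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """True when two 2x2 matrices agree element by element."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

phi, phi2 = 0.4, 1.3   # arbitrary test angles

# eq. (59): A(phi,d) A(phi',d') = A(d' phi + phi', d d')
law_ok = all(close(matmul(A(phi, d), A(phi2, d2)),
                   A(d2 * phi + phi2, d * d2))
             for d in (1, -1) for d2 in (1, -1))

# a proper and an improper rotation do not commute in general
commute = close(matmul(A(phi, 1), A(phi2, -1)),
                matmul(A(phi2, -1), A(phi, 1)))
print(law_ok, commute)
```
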


Let us reduce $S(\phi)$ to diagonal form (cf. sec. 10.17). Its eigenvectors
are found to be

$$\mathbf{v}_1 = \{\cos\phi/2, -\sin\phi/2\}; \qquad \mathbf{v}_2 = \{\sin\phi/2, \cos\phi/2\} \tag{15-60}$$

and the eigenvalues are $\pm 1$. The resulting diagonal matrix,

$$\sigma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{15-61}$$

which corresponds to a reflection through the axis of rotation, is that
obtained from $S(\phi)$ when $\phi$ equals $0$ or $2\pi$. It is interesting to observe that
the matrix of eigenvectors, eq. (60), is actually a proper rotation by the
angle $\phi/2$. Moreover,

$$S(\phi) = C^{-1}(\phi/2)\,\sigma\,C(\phi/2) \tag{15-62}$$

hence, an improper rotation is equivalent to a proper rotation by the angle
$\phi/2$, followed by a reflection and finally by a proper rotation of $\phi/2$ in the
opposite direction.
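
As a check on eqs. (58) and (62), the sketch below (plain Python; the helper
names and the test angle are our own assumptions) builds $S(\phi)$ from a
rotation by $\phi/2$, the reflection $\sigma$ and the inverse rotation, using
$C^{-1}(\phi/2) = C(-\phi/2)$. It also confirms that $S$ has trace $0$ and
determinant $-1$, so that its eigenvalues are $+1$ and $-1$.

```python
import math

def C(phi):
    """Proper rotation, eq. (52)."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]

def S(phi):
    """Improper rotation, eq. (57)."""
    return [[math.cos(phi), math.sin(phi)],
            [math.sin(phi), -math.cos(phi)]]

SIGMA = [[1, 0], [0, -1]]   # reflection, eq. (61)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

phi = 1.1                                   # arbitrary test angle
s = S(phi)
decomposed = matmul(matmul(C(-phi / 2), SIGMA), C(phi / 2))
ok62 = close(s, decomposed)                 # eq. (62)
ok58 = close(matmul(s, s), [[1, 0], [0, 1]])  # eq. (58)
trace = s[0][0] + s[1][1]
det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
print(ok62, ok58, round(trace, 12), round(det, 12))
```
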

It will be remembered that every element of the group $R^{+}(2)$ is in a
class by itself. It does not follow, however, that the proper rotations of
$R^{\pm}(2)$ are each in a separate class. Thus the element represented by
$C(\phi)$ is in the same class with the element represented by $C'(\phi)$, since

$$S^{-1}(\phi)\,C(\phi)\,S(\phi) = C'(\phi)$$

where

$$C'(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}$$

and

$$C'(\phi) = C^{-1}(\phi) = C(-\phi)$$

There are an infinite number of classes as before but each class contains the
proper rotation by $\phi$ and the proper rotation by $-\phi$.
On the other hand, all improper rotations are in the same class. If
$S(\phi)$ and $S(\phi')$ are representations of two improper rotations, we find that

$$C^{-1}(\phi'')\,S(\phi)\,C(\phi'') = S(\phi') \tag{15-62a}$$

where $\phi' = \phi + 2\phi''$. This could have been inferred from eq. (62), for it
is a special case of (62a) when we set $\phi'' = \phi/2$, $\phi = 0$ or $2\pi$ and $\phi' = \phi$.
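
Both class relations can be tested directly. In the following sketch (our own
illustration, with assumed helper names and test angles) we use the fact that
$S^{-1} = S$ by eq. (58) and that $C^{-1}(\phi'') = C(-\phi'')$.

```python
import math

def C(phi):
    """Proper rotation, eq. (52)."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]

def S(phi):
    """Improper rotation, eq. (57)."""
    return [[math.cos(phi), math.sin(phi)],
            [math.sin(phi), -math.cos(phi)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

phi, phi2 = 0.9, 0.35   # phi2 plays the role of phi''

# S^{-1}(phi) C(phi) S(phi) = C(-phi): rotations by phi and -phi share a class
rot_ok = close(matmul(matmul(S(phi), C(phi)), S(phi)), C(-phi))

# eq. (62a): C^{-1}(phi'') S(phi) C(phi'') = S(phi + 2 phi'')
ref_ok = close(matmul(matmul(C(-phi2), S(phi)), C(phi2)),
               S(phi + 2 * phi2))
print(rot_ok, ref_ok)
```
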

If the representations of eq. (56) are transformed by the matrix of
eigenvectors (54), the result is

$$\Gamma^{(m)}(\phi,1) = C^{(m)}(\phi) = \begin{pmatrix} e^{im\phi} & 0 \\ 0 & e^{-im\phi} \end{pmatrix} \tag{15-63}$$

$$\Gamma^{(m)}(\phi,-1) = S^{(m)}(\phi) = \begin{pmatrix} 0 & e^{-im\phi} \\ e^{im\phi} & 0 \end{pmatrix} \tag{15-64}$$

with $m = 1$. However, when $m = 0, 1, 2, \cdots$ the same matrices also
satisfy the multiplication requirements of the group. They are irreducible
except when $m = 0$. There, we obtain (see problem at the end of this
section)

$$C^{(0)} = 1; \quad S^{(0)} = 1$$

$$C^{(0')} = 1; \quad S^{(0')} = -1 \tag{15-65}$$
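
That the matrices (63) and (64) satisfy the multiplication requirement (59)
for every integer $m$ may be verified with the short sketch below. It is
written in plain Python; the function name `gamma` and the sample values are
assumptions made for illustration only. The traces give the characters
$2\cos m\phi$ for the proper and $0$ for the improper rotations.

```python
import cmath
import math

def gamma(m, phi, d):
    """Matrices of eqs. (63) (d = +1) and (64) (d = -1)."""
    e = cmath.exp(1j * m * phi)
    if d == 1:
        return [[e, 0], [0, e.conjugate()]]
    return [[0, e.conjugate()], [e, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

phi, phi2 = 0.6, 2.2   # arbitrary test angles

# Gamma(phi,d) Gamma(phi',d') = Gamma(d' phi + phi', d d'), cf. eq. (59)
law_ok = all(close(matmul(gamma(m, phi, d), gamma(m, phi2, d2)),
                   gamma(m, d2 * phi + phi2, d * d2))
             for m in range(0, 5) for d in (1, -1) for d2 in (1, -1))

def trace(a):
    return a[0][0] + a[1][1]

chars_ok = all(abs(trace(gamma(m, phi, 1)) - 2 * math.cos(m * phi)) < 1e-12
               and abs(trace(gamma(m, phi, -1))) < 1e-12
               for m in range(0, 5))
print(law_ok, chars_ok)
```
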


A slightly different procedure is sometimes desirable. We see from
eq. (62) that any improper rotation may always be written as a combination
of a proper rotation and a reflection. The elements of the group
could thus be considered as an infinite number of proper rotations and the
single improper rotation which is represented by $\sigma$. When the latter is
transformed by means of (54) we obtain

$$\Gamma(0,-1) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{15-64a}$$

Thus the irreducible representations are those of (63) and the single one of
(64a). When $m = 0$, we again get (65).

Problem. Show that, for $m = 0$, both (63) and (64) may be reduced to diagonal
form with the matrix $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$.
15.17. The Dihedral Groups.—An important subgroup of R* (2) is 
obtained by restricting the values of $\phi$. Consider a regular polygon in the
$XY$-plane with coordinates of the $n$ corners

$$x_k = r\cos 2\pi k/n; \quad y_k = r\sin 2\pi k/n; \quad (k = 0, 1, 2, \cdots, n - 1)$$

where $r$ is the radius vector from the origin to the corner. Now if $\phi$ in (56)
takes the value $2\pi/n$, the matrix $A(\phi,d)$ will transform the polygon into
itself by either a proper or an improper rotation. The elements of the
group will be indicated by $C$ in the former case and by $S$ in the latter. The
corresponding matrices are $A(2\pi/n,d)$ with the appropriate choice of $d$,
but we will find it convenient again to use $C$ and $S$ for the matrices,
distinguishing between the abstract element and its representation by means
of different type. The whole group is finite and of order $2n$; it is called the
dihedral group $D_n$. It may be generated by the relations

$$C^n = E; \quad S^2 = E; \quad SC = C^{-1}S \tag{15-66}$$

We now see why the group of sec. 15.1 was called $D_3$. If we let $n = 3$ in
eq. (66), we will have the generating relation of eq. (1), provided we
reletter the elements $C$ and $S$ of (66) so that they read $A$ and $C$,
respectively.

Suppose $q$ is an integer; then we may write $n = 2q + 1$ if $n$ is odd or
$n = 2q$ if $n$ is even. There will be $(q + 1)$ classes among the proper
rotations for both $n$ even and $n$ odd. These will correspond to
$C, C^2, C^3, \cdots, C^q, C^n = E$. For $n$ odd, there will be one additional class,
that of $S$. For $n$ even, there will be two classes involving an improper
rotation. The separation into classes for both cases is illustrated in the
problem in this section.
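
The class structure just described follows mechanically from the relations
(66). As an illustration (not taken from the original text), the Python sketch
below stores the element $C^aS^s$ as the pair `(a, s)`, multiplies elements by
means of $SC = C^{-1}S$, and collects the classes by conjugation. The
representation of elements as pairs and the function names are our own
assumptions.

```python
def mul(g, h, n):
    """(C^a S^s)(C^b S^t) = C^(a + (-1)^s b) S^(s+t), from SC = C^-1 S."""
    (a, s), (b, t) = g, h
    return ((a + (-1) ** s * b) % n, (s + t) % 2)

def inv(g, n):
    """Inverse element: C^-a for a rotation; C^a S is its own inverse."""
    a, s = g
    return ((-a) % n, 0) if s == 0 else (a, 1)

def classes(n):
    """Classes of D_n, each a sorted list of (a, s) pairs."""
    elems = [(a, s) for s in (0, 1) for a in range(n)]
    seen, out = set(), []
    for g in elems:
        if g in seen:
            continue
        cls = {mul(mul(inv(h, n), g, n), h, n) for h in elems}
        seen |= cls
        out.append(sorted(cls))
    return out

def name(g):
    """Readable label such as E, C^2 or C^3S."""
    a, s = g
    rot = "" if a == 0 else ("C" if a == 1 else f"C^{a}")
    return (rot + ("S" if s else "")) or "E"

for n in (5, 6):
    print(n, [[name(g) for g in cls] for cls in classes(n)])
```

For $n = 6$ this gives six classes and for $n = 5$ four classes, in agreement
with the count $(q + 1)$ for the proper rotations plus two or one for the
improper ones.
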

If $n$ is odd, the classes for proper rotations will be represented by
$C^{(0)}$ and $C^{(0')}$ of eq. (65) and $q$ matrices of (63) with $m = 1, 2, \cdots, q$. The
class of $S$ is represented by the remaining one-dimensional matrices of (65)
and $q$ matrices like (64) or (64a). If $n$ is even, the situation is similar,
except for the case $m = n/2$ when (63) and (64) become

$$C^{(n/2)}(\pi) = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}; \quad S^{(n/2)}(\pi) = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \tag{15-67}$$

This representation may be reduced to give $C^{(q)} = -1$; $S^{(q)} = 1$;
$C^{(q')} = -1$; $S^{(q')} = -1$. Hence, when $n$ is even there are four
representations of degree one and $(q - 1)$ of degree two. The characters of
dihedral groups are shown in Table 6.


TABLE 6

n odd;  q = (n - 1)/2;  φ = 2π/n

          C(E)   C(C)           ...   C(C^q)          C(S)
Γ(0)      1      1              ...   1               1
Γ(0')     1      1              ...   1               -1
Γ(1)      2      2 cos φ        ...   2 cos qφ        0
Γ(2)      2      2 cos 2φ       ...   2 cos 2qφ       0
...
Γ(q)      2      2 cos qφ       ...   2 cos q²φ       0

n even;  q = n/2

          C(E)   C(C)           ...   C(C^(q-1))        C(C^q)           C(S)   C(S')
Γ(0)      1      1              ...   1                 1                1      1
Γ(0')     1      1              ...   1                 1                -1     -1
Γ(q)      1      -1             ...   (-1)^(q-1)        (-1)^q           1      -1
Γ(q')     1      -1             ...   (-1)^(q-1)        (-1)^q           -1     1
Γ(1)      2      2 cos φ        ...   2 cos (q-1)φ      2 cos qφ         0      0
Γ(2)      2      2 cos 2φ       ...   2 cos 2(q-1)φ     2 cos 2qφ        0      0
...
Γ(q-1)    2      2 cos (q-1)φ   ...   2 cos (q-1)²φ     2 cos q(q-1)φ    0      0
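
A useful check on Table 6 is the orthogonality of the rows: the sum over
classes of (number of elements) × (product of characters) must equal $2n$ for a
row with itself and zero for two different rows. The sketch below is our own
illustration in plain Python; the function names and the ordering of the
classes, which follows the layout of Table 6, are assumptions.

```python
import math

def dihedral_characters(n):
    """Return (class_sizes, rows) for D_n in the layout of Table 6."""
    phi = 2 * math.pi / n
    if n % 2:                       # n odd: classes E, C, ..., C^q, S
        q = (n - 1) // 2
        sizes = [1] + [2] * q + [n]
        ones = [1] * (q + 1)
        rows = [ones + [1], ones + [-1]]
        for m in range(1, q + 1):
            rows.append([2 * math.cos(m * k * phi) for k in range(q + 1)] + [0])
    else:                           # n even: E, C, ..., C^(q-1), C^q, S, S'
        q = n // 2
        sizes = [1] + [2] * (q - 1) + [1, q, q]
        ones = [1] * (q + 1)
        alt = [(-1) ** k for k in range(q + 1)]
        rows = [ones + [1, 1], ones + [-1, -1], alt + [1, -1], alt + [-1, 1]]
        for m in range(1, q):
            rows.append([2 * math.cos(m * k * phi) for k in range(q + 1)] + [0, 0])
    return sizes, rows

def orthogonal(n, tol=1e-9):
    """True when the rows satisfy the orthogonality relations for order 2n."""
    sizes, rows = dihedral_characters(n)
    for i, ri in enumerate(rows):
        for j, rj in enumerate(rows):
            total = sum(s * a * b for s, a, b in zip(sizes, ri, rj))
            if abs(total - (2 * n if i == j else 0)) > tol:
                return False
    return True

print([orthogonal(n) for n in range(3, 9)])
```
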

Problem. Show that if $n = 6$, the classes of the group are $C(E) = E$; $C(C) = C, C^5$;
$C(C^2) = C^2, C^4$; $C(C^3) = C^3$; $C(S) = S, C^2S, C^4S$; $C(S') = CS, C^3S, C^5S$.
If $n = 5$, show that the classes are $C(E) = E$; $C(C) = C, C^4$; $C(C^2) = C^2, C^3$;
$C(S) = S, CS, C^2S, C^3S, C^4S$.

15.18. The Crystallographic Point Groups.’*—By considering all opera- 
tions which transform certain solid geometric figures into themselves, we 
obtain a number of finite subgroups of $R^{\pm}(3)$, called the crystallographic
point groups. They are of considerable importance in the study of crystal
and molecular structure. We assume that one point of the figure is
fixed in space so that if we know the position of two more points which are
not collinear with the fixed point, the position of the figure is completely
determined. Under these conditions, the only possible types of motion are
rotations around an axis passing through the fixed point and reflections in
a plane containing that point. All other motions may be reduced to a
combination of these two, for as we have seen in sec. 15.16 any improper
rotation may always be written as a product of two proper rotations and a
reflection. When the improper rotation is an inversion (i.e., improper
rotation by the angle $\pi$) a point will be collinear with its original position
and some fixed point on the axis of rotation, hence an inversion is uniquely
determined by the position of this fixed point and is independent of the
position of the axis. The fixed point is called a center of inversion.

We thus have four fundamental operations: (a) a proper rotation $C_n$
by an angle $\phi = 2\pi/n$ ($n$ is an integer) about an $n$-fold axis of rotation;
(b) reflection in a plane, indicated by $\sigma_h$, $\sigma_d$, $\sigma_v$ (subscripts $h$, $d$ and $v$ refer
to horizontal, diagonal and vertical planes); (c) an improper rotation, $S_n$;
(d) inversion, indicated by $I$.

Selected sets of such operations, together with a unit element which
leaves every point of a figure unchanged, are the elements of the
crystallographic groups. The number of these groups which is of interest is
limited by the fact that we need to consider only those types of symmetry
which occur in crystals or molecules. It may be shown, from geometric
arguments, that crystals in nature may have axes of rotation only for
$n = 1, 2, 3, 4, 6$, and this fact restricts the total number of crystallographic
point groups to 32. Some gaseous molecules may exist with axes $n = 5, 7, 8$
and the appropriate character tables are readily found. For the linear
molecule without a center of symmetry, like CO or HCl, the symmetry
group is $C_{\infty v}$, isomorphous with $D_\infty$ and $R^{\pm}(2)$. If a linear molecule has a
center of symmetry, like H₂ or C₂H₂, the group is $D_{\infty h}$.

22 More details about the crystallographic groups are given by Schoenflies, A.,
“Theorie der Kristallstruktur,” Gebrüder Borntraeger, Berlin, 1932, and in other
references cited later in this chapter. The geometric arguments given here, which we
do not prove, are discussed in detail by Schoenflies; see also, Rosenthal, J. and Murphy,
G. M., Revs. Mod. Phys. 8, 317 (1936).

The crystallographic groups may be generated in an elegant way from
group theory considerations but we present the results without proof.
Consider first the cyclic groups, designated by $C_n$ ($n = 1, 2, 3, 4, 6$) and of
order $n$. A new group of order $2n$ may be obtained by adding $n$ two-fold
axes of symmetry to $C_n$, in a plane perpendicular to the principal $n$-fold
axis of the cyclic group. These are the dihedral groups, $D_n$ ($n = 2, 3, 4, 6$),
but only four in number since $D_1$ duplicates $C_2$. Two more, containing
proper elements of symmetry only, are the cubic groups, $T$ of order 12 and
$O$ of order 24, having the symmetry of the tetrahedron and the octahedron,
respectively.

Of the required 32 groups, 11 have now been found. They contain
nothing but proper elements of symmetry, so the remaining 21 groups must
contain both proper and improper elements, which could be planes
(improper rotations by the angle zero), the inversion (improper rotation by $\pi$),
or rotary reflections (rotation by the angle $2\pi/n$, followed by a reflection in
a plane perpendicular to the axis of rotation). Let us first add horizontal
planes of symmetry, $\sigma_h$, to the groups $C_n$ and require that they be
perpendicular to the principal axis of the proper rotation. The results are $C_{nh}$
($n = 1, 2, 3, 4, 6$) of order $2n$, but when $n$ is even a center of symmetry also
exists and the group could also be written as the direct product, $C_{nh} =
C_n \times I$. When $\sigma_h$ is added to $D_n$, we get $D_{nh}$ ($n = 2, 3, 4, 6$) and again for
$n$ even, $D_{nh} = D_n \times I$, but $n = 1$ duplicates $C_{2v}$. The two cubic groups
$T_h$ and $O_h$ also have centers of symmetry.

Now add vertical planes of symmetry, $\sigma_v$, to $C_n$, through the $n$-fold axes
to get $C_{nv}$ ($n = 2, 3, 4, 6$) of order $2n$. When $n = 6$, the group is
isomorphous with $D_{3h}$. When $n = 1$, duplication occurs since $C_{1v}$ and $C_{1h}$ are
identical in configuration, differing only in orientation. Addition of
vertical planes of symmetry to $D_n$ adds nothing new, for the planes would
coincide with the existing two-fold axes. However, if the added planes are
diagonal, $\sigma_d$, and if they bisect the angle of the two-fold axes, we get $D_{nd}$
($n = 2, 3$) and $T_d$. When $n = 2$, the group is isomorphic with $C_{4v}$;
when $n = 3$, $D_{3d} = D_3 \times I$; $n = 4$ or $6$ would require 8-fold and 12-fold
axes of symmetry, hence they are impossible for crystals; $n = 1$ duplicates
$C_{2v}$.

Three more groups will complete the list of 32. Let us try improper
rotations, $S_n$, but with $n$ even, for an $n$-fold improper axis implies a proper
axis, $C_p$ ($p = n/2$). We are thus limited to $n = 2, 4, 6$. When $n = 2$,
$S_2 = C_1 \times I$, usually designated $C_i$; similarly, $S_6 = C_3 \times I = C_{3i}$.
There is no center of symmetry, however, for $S_4$.

For convenient reference, these results are collected in Table 7. The
first column contains the proper groups $(G)$. The cyclic groups are of
order $n$; the dihedral groups of order $2n$; $T$ is of order 12 and $O$ of order 24.
Improper groups $(G')$ on the same line in the table with a proper group $(G)$
have the same order as $(G)$ and the groups are isomorphous with each
other. An improper group $(G) \times I$, which is the direct product of $(G)$
and $I$, has an order twice that of $(G)$. Other relations between the
various groups would have been apparent if they had been derived in other
possible ways. These relations may also be found by study of appropriate
solid models or plane diagrams. Stereographic projections of the solid
figures are suitable for such a study.²³

The symbols given in Table 7 are those generally used in molecular
problems and devised by Schoenflies. Some alternative, but lesser used
symbols, are shown in parentheses. The dihedral group, $D_2$, for example
is often called $V$ for “Vierergruppe.” It is that of the Cartesian coordinate
system with three mutually perpendicular two-fold axes. The meaning
of the other alternative forms will be obvious. Crystallographers,
unfortunately, have used a bewildering variety of systems²⁴ for designating
the groups.

In the tables at the end of this section we present the characters for
all of these groups. It is convenient to indicate a class by means of
symbols like $C_n$, $S_n$ or $\sigma$, a typical element of it. If a number precedes the
symbol it is the number of elements in that class; otherwise the class in
question contains but one element. Representations of degree one are
indicated²⁵ by $A$ or $B$; of degree two by $E$ (except for certain cases, where
two one-dimensional representations occur in pairs); of degree three by $T$.
When two one-dimensional representations $A$ and $B$ occur in the same
group, it will be found that the character of $A$ is $+1$ for the class
representing rotation by $2\pi/n$ around the principal $n$-fold axis and $-1$ for $B$. The
principal axis is always taken in the direction of $Z$. Different
representations of similar symmetry to reflection in a plane perpendicular to the
principal axis are indicated by $'$ and $''$ while subscripts $g$ and $u$ refer to
positive and negative characters for the class of $I$.

23 They are given, for example, by Eyring, H., Walter, J., and Kimball, G. E.,
“Quantum Chemistry,” John Wiley and Sons, Inc., New York, 1944. Easily
understood perspective drawings of the group symmetries may be found in Davey, W. P.,
“A Study of Crystal Structure and Its Applications,” McGraw-Hill Book Co., Inc.,
New York, 1934.

24 See Davey, loc. cit. for a discussion of these nomenclatures.

25 Further description of the designation of representations, especially the usage of
molecular spectroscopists, is given by Herzberg, G., “Molecular Spectra and Molecular
Structure, Vol. II, Infrared and Raman Spectra of Polyatomic Molecules,” D. Van
Nostrand Co., Inc., New York, 1945.


TABLE 7

THE CRYSTALLOGRAPHIC POINT GROUPS

Proper Groups      Improper Groups
(G)                (G')                  (G) × I

Cyclic Groups
C_1                —                     C_i (S_2)
C_2                C_1h (C_s)            C_2h
C_3                —                     C_3i (S_6)
C_4                S_4                   C_4h
C_6                C_3h                  C_6h

Dihedral Groups
D_2 (V)            C_2v                  D_2h (V_h)
D_3                C_3v                  D_3d
D_4                C_4v; D_2d (V_d)      D_4h
D_6                C_6v; D_3h            D_6h

Cubic Groups
T                  —                     T_h
O                  T_d                   O_h
Methods of finding the characters²⁶ for cyclic and dihedral groups
have already been described in detail. The cubic groups $O$ and $T$ are the
symmetric and alternating groups on four letters; they have been discussed
as examples of permutation groups. The remaining groups, which are
indicated as a direct product, will have twice as many classes and
representations as appear in our tables. Each representation given there will
occur once with the subscript $g$ and once with $u$ (except for $C_{3h}$, where the
representations are $A'$, $A''$, $E'$, $E''$). For example, $C_{3i} = C_3 \times I$ will
have classes $E$, $C_3$, $C_3^2$, $I$, $IC_3$, $IC_3^2$. The classes which are found in $C_3$
will have the same characters as $C_3$, once as $g$ and once as $u$ while the new
classes will have the same characters as $C_3$ for $g$-representations and the
negative of those for $u$-representations. Groups having the same character
table are isomorphous.

For convenience of reference, we also include the infinite group $D_\infty$
which is isomorphous with both $R^{\pm}(2)$ and $C_{\infty v}$ and the group $D_{\infty h} = D_\infty
\times I$ which is isomorphous with $R^{\pm}(2)$.
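
The doubling of a character table on forming the direct product with $I$ may
be illustrated for $C_{3i} = C_3 \times I$. The sketch below is our own (plain
Python); the labels `E(a)` and `E(b)` for the two one-dimensional members of
the pair $E$ and the function names are assumptions. It builds the six rows of
$C_{3i}$ from the three rows of $C_3$ and checks their orthogonality for a
group of order six.

```python
import cmath

eps = cmath.exp(2j * cmath.pi / 3)

# Characters of C_3 for the classes E, C_3, C_3^2
C3 = {
    "A":    [1, 1, 1],
    "E(a)": [1, eps, eps.conjugate()],
    "E(b)": [1, eps.conjugate(), eps],
}

def times_inversion(table):
    """Character table of G x I; classes are ordered as those of G, then I times them."""
    out = {}
    for label, chars in table.items():
        out[label + "g"] = chars + chars                 # g: same sign on new classes
        out[label + "u"] = chars + [-c for c in chars]   # u: opposite sign
    return out

C3i = times_inversion(C3)

def scalar(a, b):
    """Sum over classes of conj(a) * b (every class here has one element)."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

order = 6
orthogonal = all(
    abs(scalar(C3i[p], C3i[r]) - (order if p == r else 0)) < 1e-12
    for p in C3i for r in C3i
)
print(sorted(C3i), orthogonal)
```
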

One further question of interest here concerns the transformation
properties of a vector when subjected to the operations of a
crystallographic group. We have shown, in sec. 15.15, how a vector is transformed
by the elements of the group $R^{\pm}(3)$. The representations from which this
effect is immediately seen are given by (43) and (49), the characters of
which are

$$\Xi_p = 1 + 2\cos\phi; \qquad \Xi_i = -1 + 2\cos\phi$$

for proper and improper rotations, respectively.
The same characters must also apply to the crystallographic groups since
they are subgroups of $R^{\pm}(3)$, but it does not follow that the characters
remain irreducible. As an example, consider the group $C_4$ where all of the
classes involve proper rotations. The angles for the classes of $E$, $C_2$, $C_4$, $C_4^3$
are $0$, $\pi$, $\pi/2$, $3\pi/2$, hence $\Xi_p = 3, -1, 1, 1$. Comparison with the
character table for $C_4$ shows that these numbers are the sums of the characters for
the representations $A$ and $E$. The reader should draw a figure of the
appropriate symmetry which in this case is a square. Let the $Z$-axis be
perpendicular to the plane of the paper, then it is immediately obvious that
$z$ transforms like $A$ for $z$ is unchanged by the operations of the group.
When the operation $C_2$ is applied to the figure (i.e., rotation by $\pi$) $x$ is
transformed into $-x$ and $y$ into $-y$, hence $x + iy$ becomes $-(x + iy)$.
Proceeding in this way with the other elements of the group, it will be
seen that $x + iy$ transforms like the first set of characters for $E$ in Table 8
and $x - iy$ like the second set. For $S_4$, the last two classes are improper
rotations with $\Xi_i = -1, -1$, hence the reducible characters are $3, -1,
-1, -1$ or $B + E$; $z$ transforms like $B$ and $x \pm iy$ like $E$. We have
indicated all of these transformation properties in our tables. When two or
more groups are isomorphous and the representations are the same
(examples, $D_4$, $C_{4v}$ and $D_{2d}$ or $C_4$ and $S_4$), the characters for the coordinates refer
to the first group of that table. To obtain them for the other groups, one
must change the sign of the characters for the improper rotations, for
example, $z$ transforms like $A_2$ for $D_4$, like $A_1$ for $C_{4v}$ and like $B_2$ for $D_{2d}$.

26 The reducible representations of all of the crystallographic groups are given by
Seitz, Z. Kristallographie, A88, 433 (1934).
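
The reduction of the vector characters for $C_4$ and $S_4$ can be carried out
with the usual formula $n_i = (1/g)\sum \chi_i^{*}\chi$, where $g$ is the order of the
group. The Python sketch below is an illustration of ours; the dictionary
keys naming the two one-dimensional members of $E$ and the function names are
assumptions, while the characters themselves are those of the $C_4$ table.

```python
import math

# Classes E, C_2, C_4, C_4^3 (for S_4: E, C_2, S_4, S_4^3) and their angles
ANGLES = [0.0, math.pi, math.pi / 2, 3 * math.pi / 2]

IRREPS = {
    "A":          [1, 1, 1, 1],
    "B":          [1, 1, -1, -1],
    "E (first)":  [1, -1, 1j, -1j],
    "E (second)": [1, -1, -1j, 1j],
}

def vector_characters(improper):
    """1 + 2 cos(phi) for proper and -1 + 2 cos(phi) for improper rotations."""
    return [(-1 if imp else 1) + 2 * math.cos(a)
            for a, imp in zip(ANGLES, improper)]

def reduce_characters(chi):
    """Number of times each irreducible representation occurs in chi."""
    out = {}
    for label, row in IRREPS.items():
        n = sum(complex(r).conjugate() * c for r, c in zip(row, chi)) / len(chi)
        out[label] = round(n.real)
    return out

c4 = reduce_characters(vector_characters([False, False, False, False]))
s4 = reduce_characters(vector_characters([False, False, True, True]))
print(c4)
print(s4)
```
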


TABLE 8

CYCLIC GROUPS

C_1:   E

A; x, y, z       1


C_i:   E    I
C_2:   E    C_2
C_1h:  E    σ_h

C_i             C_2        C_1h
A_g             A; z       A'; x, y       1    1
A_u; x, y, z    B; x, y    A''; z         1   -1


C_3:   E    C_3   C_3²

A; z             1    1    1
E; x ± iy        1    ε    ε*
                 1    ε*   ε

ε = e^(2πi/3)
C_3h = C_3 × σ_h;  C_3i = C_3 × I


C_4:   E    C_2   C_4   C_4³
S_4:   E    C_2   S_4   S_4³

A; z             1    1    1    1
B                1    1   -1   -1
E; x ± iy        1   -1    i   -i
                 1   -1   -i    i

C_4h = C_4 × I


C_6:   E    C_6   C_3   C_2   C_3²   C_6⁵

A; z             1    1     1     1    1     1
B                1   -1     1    -1    1    -1
E_2              1   -ε*   -ε     1   -ε*   -ε
                 1   -ε    -ε*    1   -ε    -ε*
E_1; x ± iy      1    ε*   -ε    -1   -ε*    ε
                 1    ε    -ε*   -1   -ε     ε*

ε = e^(2πi/6)


DIHEDRAL GROUPS

D_2:   E    C_2(z)   C_2(y)   C_2(x)
C_2v:  E    C_2      σ_v      σ_v'
C_2h:  E    C_2      σ_h      I

D_2        C_2v       C_2h
A_1        A_1; z     A_g           1    1    1    1
B_3; x     B_2; y     B_g           1   -1   -1    1
B_1; z     A_2        A_u; z        1    1   -1   -1
B_2; y     B_1; x     B_u; x, y     1   -1    1   -1

D_2h = D_2 × I


D_3:   E    2C_3   3C_2
C_3v:  E    2C_3   3σ_v

A_1              1    1    1
A_2; z           1    1   -1
E; x ± iy        2   -1    0

D_3d = D_3 × I


D_4:   E    C_2   2C_4   2C_2'   2C_2''
C_4v:  E    C_2   2C_4   2σ_v    2σ_d
D_2d:  E    C_2   2S_4   2C_2    2σ_d

A_1              1    1    1    1    1
A_2; z           1    1    1   -1   -1
B_1              1    1   -1    1   -1
B_2              1    1   -1   -1    1
E; x ± iy        2   -2    0    0    0

D_4h = D_4 × I


D_6:   E    C_2   2C_3   2C_6   3C_2'   3C_2''

A_1              1    1    1    1    1    1
A_2; z           1    1    1    1   -1   -1
B_1              1   -1    1   -1    1   -1
B_2              1   -1    1   -1   -1    1
E_1; x ± iy      2   -2   -1    1    0    0
E_2              2    2   -1   -1    0    0


D_∞:   E    2C(φ)   C_2
C_∞v:  E    2C(φ)   σ_v

A_1; z           1    1            1
A_2              1    1           -1
E_1; x ± iy      2    2 cos φ      0
E_2              2    2 cos 2φ     0
...
E_k              2    2 cos kφ     0

D_∞h = D_∞ × I


CUBIC GROUPS

T:     E    3C_2   4C_3   4C_3²

A                1    1    1    1
E                1    1    ε    ε*
                 1    1    ε*   ε
F; x, y, z       3   -1    0    0


O:     E    8C_3   3C_2   6C_2   6C_4
T_d:   E    8C_3   3C_2   6σ_d   6S_4

A_1              1    1    1    1    1
A_2              1    1    1   -1   -1
E                2   -1    2    0    0
T_2              3    0   -1    1   -1
T_1; x, y, z     3    0   -1   -1    1

O_h = O × I


15.19. Applications of Group Theory.—Since group theory is concerned 
with symmetry properties, its mathematical methods should be useful in 
many physical problems. Its most obvious application consists in the 
classification of crystals and polyatomic molecules according to a group of 
the appropriate symmetry. It is natural to inquire whether group theory
may also be used in quantum mechanics. For systems containing a number
of particles, calculations by the usual methods are difficult; hence it is
fortunate that the symmetry of such problems can be utilized to some
extent in their study.²⁷

The Schrödinger equation for a system of $n$ identical particles (electrons,
protons, etc.) may be written as follows:

$$H(1,2,\cdots n)\,\psi(1,2,\cdots n) = E\,\psi(1,2,\cdots n) \tag{15-68}$$

where the numbers $1, 2, \cdots n$ appearing in both the Hamiltonian operator $H$
and the state function $\psi$ indicate that these quantities depend on the
coordinates of particles $1, 2, \cdots n$. Now it is clear that, if the coordinates
of particles $i$ and $j$ are interchanged everywhere in eq. (68), the latter
remains valid, for such an exchange amounts to no more than a relabelling
of the particles. This fact might be indicated formally by applying to (68)
the operator $P_{ij}$ defined as effecting an interchange of particles $i$ and $j$:

$$P_{ij}H(1,2,\cdots n)\,\psi(1,2,\cdots n) = E\,P_{ij}\psi(1,2,\cdots n)$$

But the functional form of $H$ is unaltered when $P_{ij}$ is applied, regardless
of its specific form, provided the particles are identical, hence this equation
reads

$$HP_{ij}\psi = EP_{ij}\psi$$

In other words, $P_{ij}\psi$ is also an eigenfunction of $H$, and one belonging to the
same eigenvalue $E$.

Now $P_{ij}$ is an element of the symmetric group on $n$ particles. Therefore
the state of affairs described above is usually expressed by saying that
the Schrödinger equation is invariant under the symmetric group.
Permutations are not the only operations under which the wave equation is
invariant. Suppose the nucleus of an atom is considered as a fixed field of force;
then rotations and reflections at this point leave the energy of the system
unchanged (i.e., the operator $H$ is invariant with respect to them). The
groups in question are those of sec. 15.15. If the atom is in a homo- 
geneous electric or magnetic field, the appropriate group is the subgroup 
of rotations about a fixed axis (see sec. 15.16). For a diatomic molecule, 
the two nuclei are the centers of force (as a first approximation) and the 
groups are those of rotation about, and reflection in a plane through, the 
line joining the nuclei. If the nuclei are identical (as in hydrogen, oxy- 
gen or nitrogen) reflections in a plane perpendicular to the internuclear 
line (i.e., exchange of the nuclei) must also be included. For a polyatomic
molecule, the potential energy has the same symmetry as the molecule
itself, hence the wave equation is invariant to some one of the
crystallographic groups. These examples should be sufficient to indicate the kinds
of groups which are of importance in quantum mechanical problems. Each
case must be studied individually and all groups under which the particular
Schrödinger equation to be studied is invariant must be taken together to
form the complete group of the Schrödinger equation.

27 Such usage has been discussed in detail, especially by Wigner and Weyl in
references cited at the end of this chapter. Many of the books listed in Chapter 11 on
quantum mechanics, particularly that of Dirac, avoid the formalism of group theory but
obtain equivalent results by a more physical procedure.

As a simple example of the method, consider the one-dimensional wave
equation²⁸

$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right]\psi(x) = W\psi(x) \tag{15-69}$$

where the potential energy is of such a form that

$$V(x) = V(-x)$$

and where the energy state $W$ is non-degenerate, as is nearly always true in
such one-dimensional problems. Suppose $P_I$ is an operation which
replaces $x$ by $-x$ wherever it occurs in (69). Then

$$P_I\psi(x) = \psi'(x) = \psi(-x)$$

the result being a new $\psi$-function, $\psi'$, which has the same value at $x$ as the
old one, $\psi$, had at $-x$. The new $\psi$-function will, however, satisfy the wave
equation as well as the old one, with the same value of $W$. Hence it must
be a constant multiple of $\psi(x)$, i.e., $\psi' = c\psi$. But if both $\psi$ and $\psi'$ are to be
normalized, and since a second application of $P_I$ restores $\psi$ so that $c^2 = 1$,
$c$ can only be $+1$ or $-1$. This result recalls the well-known
fact that all eigenfunctions of eq. (69) are either even or odd functions of $x$.

To exhibit the connection with group theory we note here the following
facts which will be illuminated subsequently. Let $P_E$ be an operator that
replaces $x$ by itself. Then

$$P_E\psi(x) = \psi(x)$$

Combining $P_E$ with $P_I$ we obtain a group,

$$P_IP_E = P_EP_I = P_I; \qquad P_I^2 = P_E$$

which is isomorphous with $C_i$ (sec. 15.18), and others mentioned in
preceding sections. It has two irreducible representations both of degree one
(see Table 8). The representation $A_g$ corresponds to even eigenfunctions
and $A_u$ to odd ones.
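
The correspondence between $A_g$, $A_u$ and even, odd functions may be made
concrete by splitting an arbitrary function into its two symmetry components.
The projection formula $\frac{1}{2}(P_E \pm P_I)$ used in the sketch below is the
standard one and is our addition, not something stated in the text above; the
function names and the sample function are likewise assumptions.

```python
import math

def P_I(f):
    """Inversion operator: returns the function x -> f(-x)."""
    return lambda x: f(-x)

def project(f, sign):
    """(P_E + sign * P_I)/2: sign = +1 gives the A_g part, -1 the A_u part."""
    return lambda x: 0.5 * (f(x) + sign * f(-x))

def f(x):
    """An arbitrary function that is neither even nor odd."""
    return math.exp(-(x - 1.0) ** 2)

even, odd = project(f, +1), project(f, -1)
xs = [0.1 * k for k in range(-20, 21)]

group_ok = all(abs(P_I(P_I(f))(x) - f(x)) < 1e-12 for x in xs)     # P_I^2 = P_E
even_ok = all(abs(P_I(even)(x) - even(x)) < 1e-12 for x in xs)     # character +1
odd_ok = all(abs(P_I(odd)(x) + odd(x)) < 1e-12 for x in xs)        # character -1
sum_ok = all(abs(even(x) + odd(x) - f(x)) < 1e-12 for x in xs)
print(group_ok, even_ok, odd_ok, sum_ok)
```
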
Next, let us suppose that the Hamiltonian operator is invariant to a
group of linear substitutions, such as $R$, $S$, etc., and that the $\psi$-function
depends on $n$ coordinates $x_1 \cdots x_n$. These may be combined to form a
vector $\mathbf{x}$. If, then,

$$\mathbf{x}' = R\mathbf{x}$$

we may define an operator $P_R$ which changes $\psi(\mathbf{x})$ into $\psi(\mathbf{x}')$:

$$P_R\psi(\mathbf{x}) = \psi(\mathbf{x}')$$

Now consider two cases:

28 We now use $W$ for the eigenvalue in this section, reserving $E$ for the unit element
of a group.


a. $\psi(\mathbf{x})$ is non-degenerate. Since, from the invariance of $H$, $P_R\psi$
must also satisfy the Schrödinger equation with the same $W$, it must be
identical with $\psi$ (except for a constant multiplier).

b. $\psi(\mathbf{x})$ belongs to an eigenvalue $W$ which possesses an $\alpha$-fold
degeneracy. We may then label the $\alpha$ linearly independent functions

$$\psi_1, \psi_2, \cdots, \psi_\alpha$$

The effect of $P_R$ on $\psi_i$ will then be to convert it into a linear combination of
the $\psi_j$, for such a combination is the most general function belonging to $W$.
Hence

$$P_R\psi_i = \sum_{j} \psi_j D(R)_{ji}$$

$D(R)$ being a certain matrix associated with the operator $P_R$. Similarly,

$$P_S\psi_j = \sum_{k} \psi_k D(S)_{kj}$$

and

$$P_SP_R\psi_i = \sum_{j}\sum_{k} \psi_k D(S)_{kj}D(R)_{ji} = \sum_{k} \psi_k [D(S)D(R)]_{ki} \tag{15-70}$$


From sec. 15.7, it should be clear that the matrices whose elements appear
on the right of eq. (70) are representations of the group of the operators
$P_R$, $P_S$. The dimension of each representation equals the number of
linearly independent $\psi$-functions, hence it is also equal to the degeneracy
of the eigenvalue. If the original set of $\psi$-functions were not linearly
independent the resulting representations would be reducible. When the
complete set of irreducible representations is obtained we see that each one
would correspond to an eigenvalue of the quantum mechanical problem.
The value of group theory in quantum mechanics is thus evident. From
the symmetry of the system and without solving the wave equation we
may obtain the possible number of eigenvalues and the degeneracy of each.
Moreover the eigenvalues may be classified with regard to the particular
representation to which each belongs. This fact is of considerable interest
to the spectroscopist in studying the possible number and the symmetry
of the energy levels to be expected in a given case. For example, as
indicated in an earlier paragraph of this section, the group for the diatomic
molecule is $R^{\pm}(2)$ of sec. 15.16. This has an infinite number of
representations $m = 1, 2, 3, \cdots$ and two representations for $m = 0$. These
correspond to the electronic energy levels²⁹ $\Pi, \Delta, \Phi, \cdots$ for $m = 1, 2, 3, \cdots$
and $\Sigma^{+}$, $\Sigma^{-}$ for $m = 0$.
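
The content of eq. (70) may be illustrated with a two-fold degenerate pair of
functions. In the sketch below, which is ours and not the authors', we assume
the pair $\psi_1 = x\,g(r)$, $\psi_2 = y\,g(r)$ with a Gaussian $g$, take $R$ and
$S$ to be plane rotations of the form (52), read off $D(R)$ from
$P_R\psi_i = \sum_j \psi_j D(R)_{ji}$, and then confirm numerically that
$P_SP_R\psi_i = \sum_k \psi_k[D(S)D(R)]_{ki}$ at an arbitrary point.

```python
import math

def rot(phi):
    """Rotation matrix of the form of eq. (52)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-s, c]]

def apply(m, p):
    """x' = R x for a point p = (x, y)."""
    return (m[0][0] * p[0] + m[0][1] * p[1],
            m[1][0] * p[0] + m[1][1] * p[1])

def g(p):
    """Rotation-invariant radial factor (an assumed Gaussian)."""
    return math.exp(-(p[0] ** 2 + p[1] ** 2))

psi = [lambda p: p[0] * g(p), lambda p: p[1] * g(p)]   # degenerate pair

def P(m, f):
    """Operator P_R: (P_R f)(x) = f(R x)."""
    return lambda p: f(apply(m, p))

def D(m):
    """D(R)[j][i] from P_R psi_i = sum_j psi_j D(R)_ji."""
    basis = [(1.0, 0.0), (0.0, 1.0)]   # psi_j(basis[k]) vanishes unless j = k
    return [[P(m, psi[i])(basis[j]) / g(basis[j]) for i in range(2)]
            for j in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

R, S = rot(0.5), rot(1.2)
DSDR = matmul(D(S), D(R))
point = (0.3, -0.8)

eq70_ok = all(
    abs(P(S, P(R, psi[i]))(point)
        - sum(psi[k](point) * DSDR[k][i] for k in range(2))) < 1e-12
    for i in range(2)
)
print(eq70_ok)
```
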

The selection rules for allowed transitions in atomic and molecular
spectroscopy may be determined readily from the symmetry alone. As
shown in sec. 11.28 these depend on the matrix elements of the electric
moment. The latter is itself a vector and its components will transform
under the operations of the group like $x$, $y$, $z$ or some combination of these
components as shown in Table 8 for the various symmetry groups. The
$\psi$-function of a given state will also belong to some irreducible
representation of the group. The product of a component of the electric moment and
the $\psi$-function will transform like the direct product of the representations
for each. This direct product will often be reducible and when reduction
is effected, the result will be a sum of representations of the symmetry
group. Transitions are allowed only to these states. Actually it is not
necessary to know the representations themselves as a knowledge of the
characters alone is sufficient. The reader should refer to other sources³⁰
for the details of the theory. A simple example will show how the method
is used.

Suppose a given energy level is known to have a $\psi$-function which
transforms like $E_2$ in the group $D_6$. Then for an electric moment along $z$,
the direct product of the characters of $A_2$ and $E_2$ is $2, 2, -1, -1, 0, 0$,
hence the only allowed transition from $E_2$ is to another state of the same
symmetry. If the component of electric moment $(x \pm iy)$ is of interest,
the characters are those of $E_1$ times $E_2$ or $4, -4, 1, -1, 0, 0$ which is a sum
of characters for $B_1$, $B_2$ and $E_1$. Transitions are allowed from $E_2$ to either
$B_1$, $B_2$ or $E_1$ but to no others for the $(x \pm iy)$ component of electric moment.
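
The arithmetic of this example is easily mechanized. The following Python
sketch is our own; it assumes the standard character table of $D_6$ with the
classes ordered as $E$, $C_2$, $2C_3$, $2C_6$, $3C_2'$, $3C_2''$, an ordering chosen
to agree with the characters quoted above. It forms the direct products and
reduces them with $n_i = (1/g)\sum_{\text{classes}} h\,\chi_i\chi$, where $h$ is the
number of elements in a class.

```python
SIZES = [1, 1, 2, 2, 3, 3]   # E, C_2, 2C_3, 2C_6, 3C_2', 3C_2''

TABLE = {
    "A1": [1, 1, 1, 1, 1, 1],
    "A2": [1, 1, 1, 1, -1, -1],    # z
    "B1": [1, -1, 1, -1, 1, -1],
    "B2": [1, -1, 1, -1, -1, 1],
    "E1": [2, -2, -1, 1, 0, 0],    # x +/- iy
    "E2": [2, 2, -1, -1, 0, 0],
}

def product(a, b):
    """Characters of the direct product of representations a and b."""
    return [x * y for x, y in zip(TABLE[a], TABLE[b])]

def reduce_characters(chi):
    """Irreducible representations contained in chi, with multiplicities."""
    order = sum(SIZES)
    out = {}
    for label, row in TABLE.items():
        n = sum(h * r * c for h, r, c in zip(SIZES, row, chi)) // order
        if n:
            out[label] = n
    return out

along_z = reduce_characters(product("A2", "E2"))
in_plane = reduce_characters(product("E1", "E2"))
print(product("A2", "E2"), along_z)
print(product("E1", "E2"), in_plane)
```
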

Selection rules for the Raman effect depend in a similar way on the 
transformation properties of the polarizability tensor. Its characters are 
2 + 2 cos @ + 2 cos 26, the upper. sign referring to & proper rotation and 
the lower sign to an improper one. 

29 These are the customary symbols in molecular spectroscopy; see, for example,
Herzberg, G., “Molecular Spectra and Molecular Structure; Diatomic Molecules,”
D. Van Nostrand Co., Inc., New York, 1950.

30 See, for example, Eyring, Walter, and Kimball, loc. cit. or Meister, A. G.,
Cleveland, F. F., and Murray, M. J., Am. J. Phys. 11, 239 (1943).

As shown in sec. 9.10, the instantaneous position in space of a polyatomic
molecule containing n atoms is specified by 3n coordinates. Three
of them locate the center of gravity of the molecule and are thus associated
with translational motion. Three more (or two, if the molecule is a linear
one) describe orientation relative to principal axes of inertia and the motion
is rotation. The remaining (3n − 6) or (3n − 5) coordinates are descriptive
of internal motions or vibrations. Now the latter, as well as the
three translations, transform like vectors, and the activity of the vibration
in the infrared or the Raman effect may be determined as we have indicated.
The transformation properties of rotation are like those of angular momentum,
and from eq. (9-19) it may be shown that the reducible characters are
$1 \pm 2\cos\phi$, the upper sign again referring to proper rotations and the
lower sign to improper ones. Use of these transformation properties makes
it possible to predict in considerable detail the spectroscopic behavior of
the polyatomic molecule, provided its symmetry is known or a reasonable
one assumed.

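
The bookkeeping described above can be sketched in Python for one small molecule. Everything specific in this example is an assumption made for illustration and does not come from the text: a bent triatomic molecule of symmetry C2v lying in the yz plane, the count of atoms left unmoved by each operation, and the helper names. The vector characters are taken as ±1 + 2 cos φ and the rotation characters as 1 ± 2 cos φ; each unmoved atom is assumed to contribute one vector character to the total, from which translation and rotation are then subtracted.

```python
import math


def chi_vector(phi, proper):
    """Character of a vector: +1 + 2cos(phi) if proper, -1 + 2cos(phi) if improper."""
    return (1 if proper else -1) + 2 * math.cos(phi)


def chi_rotation(phi, proper):
    """Character of rotation (angular momentum): 1 + 2cos(phi) or 1 - 2cos(phi)."""
    return 1 + (1 if proper else -1) * 2 * math.cos(phi)


# Assumed operations of C2v: (angle phi, proper?, atoms left unmoved),
# in the class order E, C2, sigma(xz), sigma(yz), for a bent triatomic in the yz plane.
C2V_OPS = (
    (0.0, True, 3),
    (math.pi, True, 1),
    (0.0, False, 1),
    (0.0, False, 3),
)
C2V = {
    "A1": (1, 1, 1, 1),
    "A2": (1, 1, -1, -1),
    "B1": (1, -1, 1, -1),
    "B2": (1, -1, -1, 1),
}


def vibrational_characters(ops):
    """Total 3n-dimensional characters minus those of translation and rotation."""
    chars = []
    for phi, proper, unmoved in ops:
        vec = chi_vector(phi, proper)
        chars.append(round(unmoved * vec - vec - chi_rotation(phi, proper)))
    return tuple(chars)


def decompose(chi, table):
    """Reduce chi; every class of C2v holds one operation, so g = len(chi)."""
    order = len(chi)
    result = {}
    for name, irrep in table.items():
        n = sum(c * i for c, i in zip(chi, irrep)) // order
        if n:
            result[name] = n
    return result


chi_vib = vibrational_characters(C2V_OPS)
print(chi_vib)
print(decompose(chi_vib, C2V))
```

Under these assumptions the three internal motions (3n − 6 with n = 3) reduce to two of species A1 and one of species B2.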
formulas involving, 121
Bessel’s differential equation, 74, 221,
235, 540 
integral, 116 
interpolation formula, 470 
B-function, 97 
Bilinear form, 317 
Binomial coefficients, 433 
addition theorem of, 434
theorems on, 434, 435 
Binomial expansion, 433 
theorem, 113 
Biot, 252 
Bipolar coordinates, 187 
Birge, 492, 504, 518, 519 
Bliss, 215 
Bloch’s theorem, 81 
Bocher, 72, 313, 328, 543 
Body, deformable, 169 
elastic, 169 
rigid, 282 
Bohm, 429 
Bohr’s frequency condition, 402 
radius, 131 
theory, angular momentum, 340 
Bolza, 215 
Born, 27, 87, 328, 429 
Bose-Einstein statistics, 462 
Boundary conditions of differential 
equations, 269, 520, 537 
Brachistochrone, 202 
Bridgman, 15, 17 
Brillouin, 171, 429 
Browne, 11 
Brunt, 618 
Buch, 519 
Buchdahl, 27
Burckhardt, 586 
Burington, 59, 156 
Burnside, 586 
Byerly, 215, 249


Cable, hanging under own weight, 58 
Calculation, algebraic, 491 
Cambi, 136 
Campbell, 252 
Canonical ensemble, 446, 448 
equations, Hamilton's, 284, 443 
Canonically conjugate 
variables, 284 
Capacitance, 44
Caratheodory, 13, 27 
principle of, 26 
Carroll, 17 
Carslaw, 266, 267 
Cartesian components, 101 
Cartesian, coordinates, 172, 177, 338 
expansion for divergence, 150 
for scalar product, 142 
for scalar triple product, 147 
for vector product, 143 
system, 553, 576 
Casimir, 586 
Catenary, 58 
Cauchy relations, 41, 89 


Cauchy’s integral theorem, 259


Cauchy-Riemann, 90 
Cayley-Klein parameters, 287 
Central difference formulas, 467 
Central field motion, quantum treat- 
ment, 363 
Centrifuge, 38 
Chain, sliding over peg, 51 
Chapman, 431, 466 
Characteristic equation, 7 
of a matrix, 318 
functions, 246 
roots of a matrix, 319 
values, 246 
Characters of a representation, 554, 578 
tables of, 579 
Charged cylinder, potential near, 225 
Chasles’ Theorem, 275 
Chemical analysis, indirect, 314 
Chemical potentials, 25 
Christoffel three-index symbol, 167, 
196 
Churchill, 136, 267 
Circular membrane, vibrations of, 254 
Clairaut’s equation, 47, 48 
Clapeyron’s equation, 38 
Classes, 432, 547 
Classical physics contrasted with quan- 
tum mechanics, 333 
Classification of eigenvalues according 
to irreducible representations, 
584 
Clausius, 11, 24, 38 
Cleveland, 585 
Closed systems, 394 
Coefficient of diffusion, 238 
Coefficients, detached, 498
Cofactor of determinant, 303 
Cogradient variable, 318 
Colatitude, 177 
Collar, 282, 332, 498, 499, 502, 503 
Collatz, 519 
Collineatory transformation, 317 
Combinations, 431 
of three vectors, 146 
with repetitions, 433 
Commutability of operators, 337, 338 
Compatible measurements, 338 
Complementary function, 53 
Complete differential, 82 
solution, 32 
Completeness of a set of functions, 248, 
249 
eigenfunctions, 344 
Complex integration, 89 
of a group, 548 
Components of a tensor, 162 
Components, of thermodynamic system, 26
of a curvilinear vector, 174
of a vector, 137 
Composite functions of del, 153 
Conditions on state functions, 336 
Condon, 358, 422, 429 
Conducting sphere, in field of point 
charge, 226 
potential near, 224 
Conductivity, thermal, 35 
Configuration space, 336 
Confocal ellipsoidal coordinates, 178 
paraboloidal coordinates, 184 
Congruent transformation, 317, 322 
Conical coordinates, 183 
Conjugate variables, 284 
elements of a group, 547 
subgroup, 548 
Conjunctive transformation, 317 
Conservation of density-in-phase, 444 
Conservative system, 205, 281 
Constraints, 283 
Continuity, equation of, 152, 217, 
399 
Continuous group, 562 
Continuous spectrum, 402 
of eigenvalues, 336, 340 
Contraction of tensors, 166 
Contragradient variable, 317 
Contravariant tensor, 163
vector, 162 
Convolution, 261 
Coolidge, 424, 503 
Coordinate axes, 172 
line, 172, 192 
surface, 172 
Coordinate systems, orthogonal, 173 
non-orthogonal, 192 
n-dimensional, 312 
Coordinates, bipolar, 187 
Cartesian, 172, 177 
confocal ellipsoidal, 178 
confocal paraboloidal, 184 
conical, 183 
curvilinear, 172 
cylindrical, 178, 191 
ellipsoidal, 179 
elliptic cylindrical, 182 
generalized, 443 
normal, 292, 326 
oblate spheroidal, 182 
of Lagrange, generalized, 283 
Coordinates, parabolic, 185 
parabolic cylindrical, 186 
prolate spheroidal, 180 
relative, 411 
spherical polar, 177, 191 
tensors in curvilinear, 192 
toroidal, 190 
Corbin, 282 
Corson, 430, 586 
Coset, 548 
Cosines, direction, 138, 173 
Cotes, formula of Newton and, 476 
Coulomb field, motion in, 365 
Coulson, 245, 430 
Courant, 267, 270, 279, 281, 379, 528, 
543
Covariant derivative of tensor, 169, 196 
Covariant tensor, 163 
vector, 163 
Cowling, 431, 466 
Craig, 171 
Cramer’s rule, 313 
Crawford, 503 
Cross, 300, 503 
Cross product, 143 
Crumpler, 504 
Crystallographic point groups, 574 
Cubic groups, 575 
Curl in curvilinear coordinates, 175
in tensor notation, 197 
of a vector, 152 
Current in quantum mechanics, 399
Curve fitting, errors in, 518 
Curvilinear component of vector, 174 
Curvilinear coordinates, 172 
tensors in, 192 
Cycle, 
of permutation, 558 
thermodynamic, 8
Cyclic group, 543, 557
Cycloid, 203
Cylinder, potential of charged, 225 
Cylindrical coordinates, 178, 191 
elliptic, 182 
parabolic, 186 


Damped harmonic motion, 51
Darling, 299
Darwin, 452, 459
Darwin-Fowler method, 452 
Davey, 576 
De Broglie, 429 

formula, 398 

wave length, 352 
Decius, 300 
DeFay, 31
Definite form, 317 
Deformable body, 169 
Degeneracy, 296 

factor, 458 

due to particle exchange, 415 
Degenerate eigenvalues, 273 
Degree, of a cycle, 454 

of a differential equation, 32 

of a group representation, 550 
Degrees of freedom, 26, 282 
Del, 150

composite functions of, 153 

successive applications of, 153 
Delta, Kronecker, 163, 308 
Deming, 519 
De Moivre’s theorem, 557 
Density, flux, 152 
Density-in-phase, 444 
Derivative, covariant, 196 

directional, 150, 174 

of tensor, 167 

of tensor, covariant, 169 

partial, 2 


Derivative, thermodynamic, 15
Deviation, standard, 511 
Descents, method of steepest, 459 
Desch, 26 
Determinant, cofactor of, 
complementary minor, 303 
definition of, 302 
differentiation of, 305 
expansion of secular, 500 
Gram, 135 
Laplace development of, 311 
multiplication of, 303 
numerical evaluation of, 499 
numerical solution of secular, 500 
properties of, 311 
roots of a secular, 500 
solution of simultaneous equations 
by, 497
value of, 302 
Wronskian, 135 
Diagonal matrix, 298 
Diagonalization of a matrix, 319, 331 
Diatomic molecule, 360, 582 
Difference, definition of, 468 
divided, 470 
formulas, central, 470 
of tensors, 164 
of vectors, 141 
table, 468 
Differential, complete, 82 
Differential, exact, 1, 8, 82, 156 
higher order, 5 
incomplete, 82 
inexact, 82 
Differential and integral equations, 
relation of, 582 
Differential equation, of Thomas-Fermi, 491
numerical solution of, 482
partial, 216
Differential operator, 267
in tensor notation, 195
Differentiation, 261
Differentiation, by polynomial method,
473
numerical, 472
of determinants, 305
of tensor, 167
of vectors, 148
partial, 1
with interpolation formula, 472
Diffraction of waves, 234
Diffusion, differential equation of, 237
quantum mechanical, 397
Dimension of a group representation, 550
Dingle, 335
Dipole, potential due to, 228
Dipole moment, 56, 101
waves, 236
Dirac, 254, 341, 429, 582
Dirac δ-function, 239, 341
Direct product of groups, 556
sum of representations, 551
Direction cosines, 138, 173
Directional derivative, 150, 174
Dirichlet integral, 253
Discontinuous potentials, 354
Discrete group, 562
Discriminants of a quadratic form, 323
Dispersion, of a function, 436, 437
of a statistical aggregate, 349
Displacement, electric, 152
operators for spins, 407
Distribution law, Gauss, 504
quantum mechanical, 453
Distribution of probability, 436
Divergence, 151
in tensor notation, 196
theorem of, 159
Divided differences, 470
Divisor, normal, 548
Doetsch, 268
Dot product, 141
Double volume integrals, 382
two-center problem, 424
Dummy index, 162
Duncan, 282, 332, 498, 499, 503
Dushman, 429
Dwyer, 519
Dyadic, 163
Dynamics, 171


Eckart, 291, 358, 413, 565, 566
Eddington, 171
Eigendifferentials, 342
Eigenfunction, 246
completeness of, 276
of integral equations, 527
Eigenstates, simultaneous, 348
Eigenvalue, 246
degenerate, 273
of a kernel, 527
of a matrix, 319
Eigenvectors, 319, 405
Einstein-Bose, statistics, 456
Eisenhart, 176
Elastic body, 169
Electric displacement, 152
flux, 151
polarization, 161
Electricity, 171, 190
Electrodynamics, 180
Electromotive force, 42
Electron, 418
spin of, 402
Element, conjugate, 547
in probability theory, 435
line, 173, 192
of a group, 545
surface, 173, 192
volume, 173
Ellipsoid, ovary, 181
planetary, 182
Ellipsoidal coordinates, 178
Elliptic, cylindrical coordinates, 182
function, 59, 180
integral, 180
Emde, 98, 120, 254
Emission, radioactive, 439
Empirical formula, 516
error in, 518
Energy, internal, 11, 451
kinetic, 283
potential, 176
shell ensemble, 446
Ensemble, 444
canonical, 446
Gibbsian, 442
microcanonical, 446
Enthalpy, 13
Entropy, 12, 451, 464
Envelope, 48
Epstein, 1, 186
Equation, homogeneous, 313
inhomogeneous, 313
integral, 520
linear, 313
of a matrix, characteristic, 319
of continuity, 152
of state, 7
of Sturm-Liouville, 534
solution of integral, 521
Equations, numerical solution of, differential,
482, 489, 490
simultaneous, 493, 497
transcendental, 491
Equations of motion, Hamilton’s,
284
Lagrange’s, 283
Newton’s, 282
Equilibrium, heterogeneous, 26
thermodynamic, 14
Equivalence of linear operators and
matrices, 374
Equivalent matrices, 316
Ergodic hypothesis, 443
Error, average, 510
determinant, 504
function, 505
in empirical formulas, 518
of a function, probable, 515
probability of, 505
probable, 510
random, 504
root mean square, 510
Essential observability, criterion of,
334
Euler, angles, 282, 286, 368, 566
definition of gamma function, 94
equation, 200, 270
integral, 97
Maclaurin formula, 474
Mascheroni constant, 96, 427
method for differential equations,
485
Euler-Rodrigues parameters, 287
Even and odd functions, 103, 583
Exact differential, 1, 8, 27, 156
Exact differential, equation, 41
Exchange, degeneracy, 415
integral, 387, 428
forces, 387
Exclusion principle, 411, 415
Expansion, adiabatic, 36
coefficients, 374,
Expected mean, in Gibbs phase space,
445
of a sequence of observation, 342
Explicit function, 2
Exponential integral, 427
Extensive variable, 1
Extremal, 200
Eyring, 430, 576, 585


Factor group, 549
Faltung, 261, 262 
Feller, 245 
Fermi, 491 
Fermi-Dirac statistics, 455 ff., 462 
Feshbach, 171, 176, 215, 226, 281, 342, 
521, 526, 543 
Field, scalar, 149 
vector, 149
Field strength, 161 
Figures, significant, 467 
Findlay, 26 
Finkelnburg, 430 


First order, perturbation, 389


simultaneous differential equations, 
numerical solution of, 489 
Floquet’s theorem, 80 
Flow of fluid, 151 
heat, 34, 160 
particles, 399 
Flügge, 430
Fluid, flow, 151 
incompressible, 224 
Flux, density, 152 
electrical, 151 
thermal, 151 
Forbidden transition, 402 
Force, 142 
acting on particle, 282
moment of, 144 
Form, bilinear, 317 
discriminants of a quadratic, 323 
Hermitian, 329 
positive definite, 317 
quadratic, 317 
semi-definite, 317 
Formula, Bessel’s interpolation, 470 
central difference, 470 
empirical, 516 
interpolation, 468, 469 
Lagrange’s interpolation, 470 
Stirling’s interpolation, 470 
Forsythe, 67, 75, 186, 215 
Foster, 252 
Fourier, 525 
Fourier analysis, 247 
of odd and even functions, 253 
Fourier-Bessel, expansion, 257 
integral, 256 
transformation, 235, 240 
transforms, 254, 256 
Fourier integral, 252
series, 111 
theorem, 253 
Fourier transformation, 239, 259, 263 
Fourier transforms, 252 
Fowler, 452, 459, 465 
Frank, 244, 267 
Frazer, 282, 332, 498, 499, 502, 503 
Fredholm’s integral equation, 520 
method of solution, 526 
Free energy, Gibbs, 14 
Helmholtz, 18, 451, 464 
Free particle, 282, 396 
Freedom, degrees of, 282 
Frenkel, 429 
Frequency, relative, 435 
Frequency condition, Bohr’s, 402 
Fuchs’ theorem, 69 
Full linear group (FLG), 562 
Function, elliptic, 180 
even, 108, 583 
implicit, 6 
odd, 583 
potential, 283 
probable error of, 515 
scalar point, 149, 174 
Functional determinants, 17 


Gamma function, 75, 98 
space, 448 

Gas space, 443 

Gauss, differential equation, 72 
error function, 397, 437, 439, 504 
method, for numerical integration, 
theorem, 159

General solution, 32 

Generalized coordinates, 283, 443 
momentum, 283 

Generating functions, 133 

Geodesics, 200

Gibbs, 1, 13, 24, 141, 163, 193, 442, 443 
ensembles, 442 

Gibbs-Wilson, 171 

Glasstone, 31, 430, 485, 478, 492 

Goldstein, 165, 282, 287 

Gombas, 491 

Goranson, 17 

Gordy, 300 

Goudsmit, 403

Gradient, 150 
Gradient, in tensor notation, 195, 196
Graeffe’s method, for roots of a poly- 
nomial, 494 
solution of secular determinants, 500 
Gram determinant, 134, 312 
Gray, 120, 136 
Green’s formula, 536 
Green’s function, 534 
examples of, 539 
modified, 537 
Green's theorems, 161, 242 
Gregory's formula, 476 
Group, Abelian, 546 
alternating, 558, 561 
continuous, 562 
crystallographic, 574
cubic, 575 
cyclic, 546, 557 
definition of, 545 
dihedral, 572, 573 
discrete, 562 
factor, 559 
full linear, 562 
full, real orthogonal, 569 
mixed continuous, 562 
octahedral, 575 
point, 574 
quotient, 559 
rotary reflection, 570 
rotation, 565 
special unitary (SUG), 568, 567 
symmetric, 558, 561 
tetrahedral, 575 
unimodular unitary, 563 
unitary, 562 
velocity, 398
Group character, tables of, 579 
Group theory, applications of, 581 
Growth, organic, 33 
Guggenheim, 31 


Hamel, 543 
Hamilton's canonical equations, 284, 
443 
principle, 204 
Hamiltonian function, 284 
operator, 218, 341 
Hamiltonian function, operator, time 
dependent, 396 
quantum mechanical, 299 
Hankel function, 76, 118, 525 
Harmonic function, 220
motion, 50 

Heat capacity, 11, 12 
conduction, differential equation of, 

237 
content, 13 
flow, 34, 160 
linear, 238 
two and three dimensions, 240 

Heine’s formula, 109, 110 

Heisenberg, 420, 429 
matrix theory, 371 
uncertainty principle, 348 

Heitler-London method, 424 

Helium atom, normal state of, 380 
excited states, 418 
ionized, 365 

Hellinger, 543 

Helmholtz’ equation, 39, 42 
free energy, 451 
function, 69

Hermite differential equation, 76, 121 
functions, 121, 124, 359 
polynomial, 76, 121 

Hermitian form, 329 
conjugate, 310 
matrix, 310, 329 
operator, 269, 344, 374 
scalar product, 329 
vector space, 328 

Herzberg, 296, 297, 299, 300, 496, 576,
585

Heterogeneous equilibrium, 26 

Hicks, 500 

High eigenvalues, distribution of, 274 

Hilbert, 267, 270, 277, 279, 281, 528, 543 

Hilbert-Schmidt method for integral 

equations, 528 

Hobson, 136, 172, 191, 480 

Homogeneous, meaning of term, 45 

Homogeneous equations, 313 
gas reactions, 36 
integral equation, 527 
polynomial, 317 

Homo-polar binding, 424 

Horn, 543 

Horner’s method for roots of a poly- 

nomial, 494 

Householder, 519 

Houston, 429 

Howard, 299 
Hughes, 413
Hydrodynamics, 171, 180, 190
Hydrogen atom, 129, 131 
quantum mechanical treatment, 363 
Hydrogen molecular ion, 385 
Hydrogen molecule, 424 
Hyperbolic functions, 187 
Hypergeometric equation, 72, 370 
series, 72 


Ideal gas, 24 
ensemble for, 442 
Image, electrical, 228 
Implicit function, 6 
Improper rotation, 325, 569 
Ince, 88 
Independent systems, quantum me- 
chanics of, 414 
Index, dummy, 162 
of subgroup, 548 
precision, 441, 505 


umbral, 162 
Indicial equation, 61 
Indistinguishable objects, arrange- 


ments of, 432 
Inductance, 42 
Inertia, moment of, 286 
product of, 286 
Infinite potential hole, 352 
Inhomogeneous differential equation, 
242 
equations, 313 
integral equation, 526 
Inner product of vectors, 141 
tensors, 166 
Integral, elliptic, 180 
line, 155 
surface, 156 
volume, 156 
Integral and differential equations, re- 
lation of, 532 
Integral equation, Abel’s, 541 
definition of, 520 
eigenfunctions of, 527 
Fredholm's, 520 
homogeneous, 527 
inhomogeneous, 526 
kernel of, 520 
linear, 520 
non-linear, 520 
of the third kind, 520 


INDEX 


Integral equation, resolvent of, 522 
Schmidt-Hilbert method for, 528 
solution of, 521 

Integral equation, summary of methods 

of solution, 532 
use of, 532 
Volterra’s, 520 

Integrating denominator, 28, 29, 84 
factor, 41 

Integration, 261 
numerical, 473 
vector, 154 

Intensive variable, 1 

Internal energy, 11, 451 

Interpolation, inverse, 471 
two-way, 471 

Interpolation for equal values of the 

argument, 467 
unequal values of the argument, 470 

Interpolation formula, 468 
Bessel’s, 470
differentiation, 472 
Lagrange’s, 470 
Newton’s, 469 
Stirling’s, 470 

Invariance of wave equation, 582 

Invariant, 163 
subgroup, 548 

Inverse of a group element, 545 
interpolation, 471 

Inversion, 569 

Irreducible representations and eigen-

values, 582 

Isomorphism, 549 

Isoperimetric problems, 209

Isothermal process, 12 

Isotope effect, 413 

Iterated kernels, 522 

Iteration method for algebraic equa- 

tions, 493 
differential equations, 484 
solution of secular determinant, 503 


Jacobi polynomial, 74, 370 
Jacobians, 17, 18 

elliptic functions, 180 
Jaeger, 266, 267 
Jahnke, 98, 120, 254 
James, 424, 503 
Jeans, 190 
Jeffreys, B. S., 281, 498 
Jeffreys, H., 267, 281, 498, 519
Jensen, 352 

Jordan, 328, 429 
Joule-Thomson coefficient, 23 


Kamke, 88, 120 

Kellogg, 180 

Kemble, 279, 429 

Kepler’s law, 206 

Kernel, eigenvalues of, 527 
infinite, 524 
iterated, 522 
of an integral equation, 520 
symmetric, 528 

Kimball, 480, 576, 585 

Kinetic energy, 283 

Klein and Cayley parameters, 566 

Klotz, 31 

Kneser, 215, 544 

Kohn, 503 

Kolmogorov, 519 

Kowalewski, 332, 543, 544, 586

Kron, 171 

Kronecker delta, 105, 124, 163, 308, 341 

Kurosh, 586 

Kutta-Runge method for differential

equations, 486 


Lagrange’s equations of motion, 283 
generalized coordinates, 283 
interpolation formula, 470 
method of undetermined multipliers, 

210 

Lagrangian equations, 206 
function, 205, 283 
multipliers, 209 

Laguerre differential equation, 77 
function, 126 

associated, 78, 129, 366 
polynomial, 77, 126, 132 
associated, 78, 128, 182 

Lanczos, 215

Landé, 27, 429 

Language, classical, 335 

Laplace, 525 
pairs, 266 

Laplace's equation, 161, 208, 237 
applications, 217, 224, 226 
in 2 dimensions, 218 
in 3 dimensions, 220 

Laplace Transformation, 259, 260 
Laplacian, 153
in cylindrical coordinates, 191 
in spherical polar coordinates, 191 
in tensor notation, 197 
Lass, 171 
Latent heat of change of pressure, 11 
Laurent series, 92 
theorem of residues, 92 
Laurent theorem 
residues, 91, 92 
Law, Gauss distribution, 504 
Newton's, 191, 282 
Least action, principle of, 215 
Least squares, principle of, 506 
Ledermann, 586 
Legendre coefficient, 66 
differential equation, 61, 98, 540 
functions, associated, 234 
polynomials, 66, 98, 104, 105, 132, 
227, 270 
polynomial, roots of, 480 
Lehrman, 17 
Leibnitz, 202 
Lense, 216 
Lerman, 17 
Levy, 483 
Lindsay, 207, 273, 431, 439, 443, 466 
Line, coordinate, 172, 192 
element, 173, 192 
of force, 45 
integral, 1, 8, 155 
Linear dependence, 182 
equation, 313 
equations, numerical solution of 
simultaneous, 497
independence of vectors, 311
integral equation, 520
momentum in quantum mechanics, 
388 
substitution operators, 260, 406 
transformation, 306, 314, 315 
variation functions, method of, 383 
vector space, 311 
velocity, 145, 285 
Liouville-Neumann series, 521 
Sturm equation, 534 
theorem, 444
Lithium 
doubly ionized, 365 
Littlewood, 552, 554, 586 
London, 424 
Longitude, 177
Lovitt, 522, 537, 544 


Macdougall, 1 
MacDuffee, 332 
Maclaurin expansion, 99, 446 
Maclaurin, Formula of Euler and, 474 
Macmillan, 180 
MacRobert, 136 
Magnetic field, 408 
Magnetic moment of electron, 403 
Magnus, 136, 177, 252, 267 
Many-body problem, 411 
Margenau, 119, 207, 273, 352 
Marschall, 430 
Mason, 180 
Massey, 430 
Mathews, 120, 186 
Mathieu, 296, 299 
Mathieu’s differential equation, 78 
Matrices, addition of, 306 
conformable, 306 
direct product of, 307 
equivalent, 316 
multiplication of, 306 
subtraction of, 306 
Matrix, associate, 310 
adjoint, 309 
characteristic, 318 
characteristic roots of, 319 
definition of, 305 
diagonal, 308, 371, 391 
diagonalization of, 319, 331 
eigenvalues of, 319 
eigenvectors of, 319 
Hermitian, 310, 329, 371 
mechanics, 371
method of solution for secular deter- 
minants, 502 
non-singular, 562 
null, 307 
orthogonal, 310 
partition of, 307 
rank of, 306 
reciprocal, 309 
rectangular, 306 
singular, 305 
symmetric, 310 
symmetric and skew symmetric, 310 
trace of, 308
transform of, 317 
Matrix, transposed, 309
unit, 308 
unitary, 310, 330 
Maxima in a tabulated function, 472 
Maximum area, 210 
volume, 211 
Maxwell, 15, 185, 190 
Maxwell-Boltzmann, distribution law, 
449, 456 
Maxwell’s relations, 15 
Mayer, J. E., 431, 466 
Mayer, M. G., 342, 431, 466 
McConnell, 171 
McLachlan, 136 
Mean in phase space, 445 
of a function, 436 
of aggregate of measurements, 336 
Measure of precision, 441 
Measurements, rejection of, 516 
weight of, 514 
Mechanical work, 142 
Mechanics, 180, 282 
statistical, 431 
Meister, 585 
Mellin, 526 
Mellin transformation, 259 
Mellor, 494 
Membrane, vibrating circular, 254 
Method, Gauss’, 479 
Method of averages for empirical 
formulas, 516 
least squares, 517 
iteration for algebraic equations, 493 
Newton-Raphson for algebraic equa- 
tions, 492 
“regula falsi” for algebraic equa- 
tions, 491 
Microcanonical ensemble, 446 
Microscopic state, 453 
Milne, E. A., 171
Milne, W. E., 498, 499, 519 
Milne-Thomson, 180, 190 
Milne’s method for differential equa- 
tions, 489 
Minima in a tabulated function, 472 
Minimum value of integral, 198 
surface of revolution, 203 
Minor of determinant, 303 
Mixed-continuous group, 562 
Mixed tensor, 163 
Möbius strip, 156 
Molecular spectroscopy, 585
Molecule, diatomic, 582 
motion of, 290 
polyatomic, 282
potential energy of a, 295 
quantum mechanical Hamiltonian 
of a, 299 
rotational motion of a, 292 
space, 443 
translational motion of a, 291 
vibrational energy of a, 204 
vibrational frequency of a, 296 
vibrational motion of a, 291 
Moment of a force, 145 
of aggregates of measurements, 346 
of a probability distribution, 437 
of inertia, 286 
of momentum, 285 
Moment theorem, 437 
Momentum, angular, 235 
generalized, 283 
moment of, 285 
Morse, 171, 176, 215, 256, 281, 342, 
429, 521, 526, 548 
Motion, Hamilton’s equations of, 284 
Lagrange’s equations of, 283 
Newton's laws of, 191, 282 
of a molecule, 290 
Mott, 429, 430 
Muir, 302, 332 
Multiplication, 262 
multipole expansion, 100 
multipole moment, 101 
multipole strength, 101 
Multiplication of determinants, 304 
matrices, 306 
Multiplication of determinants, ten- 
sors, 165 
Multiplication table of a group, 546 
Murnaghan, 266, 548, 555, 564, 586 
Murphy, 574 
Murray, 585 
Muskhelishvili, 544
Mu-space, 443


NBS Mathematical Tables Project, 136 
Nebengruppe, 548 
Negative kinetic energy, 358 
Neumann function, 76 

Liouville series, 521 
Neutrons, 418 
Newton’s binomial expansion, 433
equations of motion, 191, 282 
interpolation formula, 469 
probability distribution, 438 

Newton-Cotes formula, 476 

Newton-Raphson, method for algebraic 

equations, 492 
roots of a polynomial, 494 

Nielsen, H. H., 300 

Nielsen, N., 120, 136 

Non-orthogonal coordinate systems, 

192 

Non-singular matrices, 562 

Normal coordinates, 292, 326 
divisor, 548 
mode of vibration, 296 

Normalization of functions, 249 

Nucleus, atomic, 282 
of an integral equation, 520 

Numbers, Bernoulli, 474 

Numerical determination of roots of 

polynomial, 494 
differentiation, 472 
evaluation of determinants, 499 
integration, 473 
secular determinants, 500 
simultaneous equations, 493-497 
solution of differential equations, 
482
transcendental equations, 491 


Oberhettinger, 186, 177, 252, 267 
Oblate spheroidal coordinates, 182 
Observability, essential, 334 
Observable, 335, 338 
Occupation numbers, 453 
Odd function, 103 
Operand, 337 
Operations composing crystallographic 
groups, 574 
Operator, 48, 336, 338 
Operator, commuting, 348 
equation, 337 
Hermitian, 269, 374 
in tensor notation, 195 
vector, 174 
Orbital, 422 
Order of a differential equation, 32 
group, 546 
group element, 546 
Ordinary differential equations, 32 
Orthogonal coordinate system, 173
matrix, 310
transformation, 317, 324 
Orthogonality of functions, 248 
quantum mechanical eigenfunctions, 
344
Orthogonalization of vectors, 312 
Orthohelium, 423 
Orthonormal functions, 249 
Oscillation, forced, 54 
natural, 52 
Oscillator, anharmonic, 59 
by matrix mechanics, 371 
harmonic, 125, 207, 300
quantum mechanical treatment, 358 
Outer product of vectors, 148 
tensors, 165 


Page, 56, 140, 210 
Parabolic coordinates, 185, 186 
Paraboloidal coordinates, 184 
Parameters, Cayley-Klein, 287 
Rodrigues, 287 
Parhelium, 423 
Partington, 31, 171
Partial differentiation, 1 
differential equation, 216 
Particle, concept of, 334 
free, 282 
vs. wave, 335 
Particles, system of n free, 282 
restricted, 282 
Particular integral, 53 
solution, 33 
Partition function, 452, 465 
of a permutation, 559 
Paul, 31 
Pauli, 417 
principle, 411 
spin theory, 402 
Pauling, 177, 367, 382, 387, 429, 430 
Peirce, 59 
Periodicity as boundary condition, 273 
Perlis, 332 
Permutations, 431, 549 
even, 417 
odd, 417 
Perturbation theory, 387 
Pfaff differential equation, 28, 82 
Phase, 25 
integral, 452 
Phase, rule, 25
space, 442 
velocity, 228, 398 
Phillips, 141, 586 
Photon, 418 
Physical system, 335 
Picard method for differential equa- 
tions, 484 
Planck's constant, 338 
Plane, potential due to charged, 225
Plummer, 507, 513 
Point function, 8 
scalar, 149, 174 
Poisson’s equation, 237, 242 
formula, 441
Polar coordinates, spherical, 177, 191 
vector, 165 
Polarizability, atomic, 392 
Polarization, electric, 56, 161 
Polyatomic molecule, 282 
Polygon, rotation of, 572 
Polynomial, complex roots of, 495 
homogeneous, 317 
method, differentiation by, 473 
for solution of secular deter- 
minants, 500 
numerical determination of roots, 
494 
roots of the Legendre, 480 
Postulates of quantum mechanics, 337 
Potential, chemical, 25 
electrostatic, 217, 224 
energy, 176, 283, 295 
theory, 180, 191 
thermodynamic, 14 
velocity, 217, 224 
Potential due to conducting sphere, 224 
charged cylinder, 225 
charged plane, 225 
Precision index, 441, 505 
measures of, 441, 510, 513 
Prigogine, 31 
Principal axis transformation, 326 
Principle of least squares, 506 
Probability, 435 
Probability, aggregate, 435 
amplitude, 347, 402 
density, 436 
of phase, 444 
theory, 435 
Probability distributions, 436 
Probability distributions, arithmetical,
436 
continuous, 436 
discrete, 436 
geometrical, 436 
Probable errors, 510 
of a function, 515 
Product, Hermitian scalar, 329 
of inertia, 286 
Product of tensors 
inner, 166 
outer, 165
Product of vectors 
cross, 143 
dot, 141 
inner, 141 
outer, 143 
scalar, 141, 311
scalar, triple, 146 
skew, 143 
three vectors, 146 
triple vector, 147 
vector, 142 
Projectile, 40 
Prolate spheroidal coordinates, 180 
Proper rotation, 325 
Property in probability theory, 435 
Protons, 418 


Quadratic form, 317 
discriminants of, 323 
Quadrature, approximate, 474 
formulas, general remarks, 481 
Quadrupole moment, 101 
Quadrupole, potential due to, 228 
Quantum dynamics, 337 
mechanics, general discussion, 
333 
number, total, 366 
statics, 337 
Quotient group, 549 


Radiation theory, 400 

Radioactive decay, 33, 43 
emission, 439 

Radius vector, 153 

Rainville, 88 

Raman effect, 585 

Random walk, 439, 442 

Randomness, criterion of, 435 

Rank of tensor, 162 
Raphson-Newton method for algebraic
equations, 492 
roots of a polynomial, 494 
Rate of solution, 35 
Rayleigh-Schrödinger perturbation
formula, 389 
Reaction, bimolecular, 36 
consecutive, 40 
homogeneous, 36 
opposing, 40 
order of, 37 
rate, 37 
termolecular, 36, 40 
unimolecular, 36 
Real eigenvalues, 345 
Reciprocal matrix, 309 
parallelepiped, 352 
vectors, 198 
Recurrence formula, 72 
Reduced mass, 413 
Reducible representation, 550 
Reduction of group representations, 
552 
Reed, 24, 483 
Reflection coefficient of potential bar- 
rier, 356 
rotary, 325 
Regular points of a differential equa- 
tion, 71 
Relative coordinates, 411 
frequencies of measured values, 346 
frequency, 435 
velocity, 289 
Relativity, theory of, 171 
Representation, associated, 560 
of groups, 550 
irreducible, 550 
orthogonality of, 551 
reducible, 550 
self-associated, 560 
Representative point, 442 
Residuals and precision measures, 513 
Residue, 89 
Residues, theorem of, 89 
Resistance, 42 
Resolvent of an integral equation, 522 
Resonance catastrophe, 55 
Reversion of series, 471 
Rigid body, definition of, 284 
most general motion of, 285 
rotation of, 285 
Rigid body, translation of, 285

Ritz method, 377, 379 

Robertson, 176 

Robinson, 495, 497 

Rodrigues’ formula, 102 
parameters, 287 


Rojanski, 429


Root mean square error, 510 
Roots of Legendre polynomial, 480 
matrix, characteristic, 319 
polynomial, numerical determination 
of, 494, 495 
secular determinant, 500 
Rope, suspended, 543 
Rosenthal, 574 
Rossini, 31 
Rotary reflection group, 570 
Rotation, 171 
axis of, 285 
of a rigid body, 285 
vector, 152 
group 
three-dimensional, 565 
two-dimensional, 570 
improper, 324, 325, 569 
proper, 324, 325 
Rotations as groups, 566 
Rotator 
quantum mechanical treatment, 360 
rigid, 300 
Ruark, 429 
Rule, Simpson’s, 477 
trapezoidal, 477 
Weddle’s, 478 
Runge-Kutta method for differential 
equations, 486 
Rushbrooke, 466 
Rutherford, 171 
Rutledge, 473 


Saddle point, 460 
Sayvetz, 291 
Scalar, 137, 163 
field, 149 
gradient of, 150 
point function, 149, 174 
Scalar product, 141, 311 
Hermitian, 329 
triple, 146 
Scarborough, 471, 495, 519 
Schiff, 429 
Schlaefli’s formula, 102, 103, 105
Schmidt, 71
Schmidt, orthogonalization method for 
functions, 278 
vectors, 312 
Schmidt-Hilbert method for integral 
equations, 528 
Schoenflies, 574, 586
system of group notation, 576 
Schreier, 332 
Schrödinger, 186, 429, 466 
Schrödinger equation, 176, 213, 341
and group theory, 582 
involving time, 393, 394 
of free mass point, 350 
Schwank, 521 
Schwarz’ inequality, 134, 348 
Second order differential equations, 48
numerical solution of, 490 
Second order perturbation, 389 
Secular determinant, 421, 500 
Seitz, 81, 430, 578 
Self-adjoint differential equation, 268 
operator, 267 
Self-associated representation, 560 
Separation of center-of-mass coordi- 
nates, 411 
of variables, method of, 176, 218, 
220, 231 
Series, integration, 59 
Liouville-Neumann, 521 
method for differential equations, 
483 
reversion of, 471 
Shaw, 18, 519 
Shearing strain, 170 
Sherwood, 24, 483 
Shortley, 422, 429 
Significant figures, 467 
Similarity transformation, 317, 318 
Simpson’s rule, 477 
Simultaneous differential equations, 
numerical solution of, 489 
eigenstates, 348
equations, numerical solution of, 493, 
497 
Singlet states, 423 
Single-valuedness, 336 
Singular point of a differential equa- 
tion, 70 
solution, 33, 47
Singularity, essential, 71
Singularity, non-essential, 71 
Skew product, 148 
symmetric matrix, 310 
symmetric tensor, 164 
Slater, 17, 429 
Smith, 300 
Snapshot, 334 
Sneddon, 267, 429 
Soap film, 39 
Sokolnikoff, E. S., 59
Sokolnikoff, I. S., 59
Solution, rate of, 35
singular, 47
Solution of differential equations,
numerical, 482
of integral equations, 521, 532
of simultaneous equations, numeri-
cal, 493, 497
of transcendental equations, numer-
ical, 491
Sommerfeld, 119, 245, 353, 398, 429
Space, Hermitian vector, 328
linear vector, 310
Spain, 171
Special functions
Cauchy’s theorem, 90
Cauchy-Riemann, 90
Spectroscopy, molecular, 585
Speiser, 552, 554, 555
Sphere, moving through incompres-
sible fluid, 224
oscillating, 235
Spherical harmonic, 235, 362
polar coordinates, 177, 191
Sperner, 382
Spheroidal coordinates, oblate, 182
prolate, 180
Spin angular momentum, 403
coordinate, 408
degeneracy, 409
displacement operators, 407
energy, 408
function, 405
in two-body problem, 422
matrices, 405, 566
operator, 405
vector, 405
Spinning electron, 402
Spread of measurements, 349
Standard deviation, 349, 437
Stark effect, 186, 391
State 
quantum mechanical, 316 
time-dependent, 393 
function, 336 
intuitive meaning of, 343, 347 
Stationarity condition, 200 
Stationary path, 198 
states, 337 
Statistical mechanics, 431 
Steepest descents, method of, 459 
Stehle, 282
Steiner, 1, 31 
Step function, 239 
Stirling’s formula, 436 
interpolation formula, 470 
theorem, 97 
Stokes’ theorem, 157 
Strain, 161, 169, 170 
Stratton, 119, 256 
Strength, field, 161 
Stress, 161 
String, homogeneous vibrating, 542 
Sturm-Liouville equation, 534, 538 
theory, 267, 280, 342 
Subgroup, 546 
conjugate, 548 
invariant, 548 
Sublimation, 38 
Subtraction of matrices, 306 
Sum of state, 452 
of tensors, 164 
of vectors, 141 
Surface, coordinate, 172 
element, 173, 192 
integral, 156 
tension, 39 
Suspension bridge, 57 
Symbol, Christoffel three-index, 167, 
196 
Symmetric eigenfunctions, 416 
group, 558 
kernel, 528 
matrix, 310 
state function, 455
tensor, 164 
top, 368 
System, conservative, 283 
thermodynamic, 442 


Tallquist, 136
Tamarkin, 245
Taylor, 465, 478, 492 
Taylor series, 102 
Taylor series method for differential 
equations, 483 
Temperature, 29 
Tension strain, 170 
Tensor, associated, 167 
component of, 162 
contraction of, 166 
contravariant, 163 
covariant, 163 
covariant derivative of, 169 
differentiation of, 167 
first rank, 162 
length of, 166 
mixed, 163 
product, inner, 166 
product, 165 
skew-symmetric, 164 
symmetric, 164 
Tensor notation, differential operators 
in, 195 
divergence in, 196 
gradient in, 196 
Laplacian in, 197 
curl in, 197 
Tensors in curvilinear coordinates, 192 
difference of, 164 
perpendicular, 166 
sum of, 164 
Ter Haar, 466 
Theorem, Gauss’, 159 
Green’s, 161 
of divergence, 159 
of residues, 457 
Stokes’, 157 
Thermal conductivity, 35 
flux, 151 
Thermodynamic derivatives, 15 
potential, Gibbs, 14 
relations, 450 
system, 442 
variables, 1 
laws of, 11 
second law of, 13 
Thomas, L. H., 491 
Thomas, T. Y., 171 
Thomson, 207 
Three-index symbol, Christoffel, 167 
Time-dependent states, 373
Titchmarsh, 252, 267 
Tobolsky, 17 
Toeplitz, 543 
Tolman, 431, 443, 466
Top, spherical, 371
symmetrical, 368 
Toroidal coordinates, 190 
Torrance, 59, 156 
Total differentials, 3, 8 
Trace, 554 
of matrix, 308 
Trambarulo, 300 
Transcendental equations, numerical 
solution of, 491 
Transformation, collineatory, 317 
congruent, 317, 322 
conjunctive, 317 
linear, 314
orthogonal, 324 
principal axis, 326 
real orthogonal, 317 
similarity, 317, 318 
unitary, 317, 564 
Transform of a group element, 547 
of a matrix, 317 
in solving differential equation, 263 
Transients, 43 
Transition probability, 402 
forbidden, 402 
Translation, 171 
Translation of a molecule, 291 
of a rigid body, 285 
Transmission coefficient of barrier, 357 
Transparency factor of a potential 
barrier, 358 
Transposed matrix, 308
Transposition, 558 
Trapezoidal rule, 477 
Triple product, scalar, 146 
vector, 147 
Triplet states, 423 
Tschebyscheff polynomial, 74, 132 
differential equation, 73 
Tunell, 12 
Tunnel effect, 356 
Turnbull, 316, 322 
Two-body problem in quantum me- 
chanics, 413 
Two-sided transformation, 259 


Uhlenbeck, electron spin, 403
Umbral index, 162


Uncertainty in angular momentum, 350 
Uncertainty principle, 348, 394 
Unimodular unitary group, 563 
Unitary group, 562 
matrix, 310, 380 
transformation, 317, 564 
Unit element in group theory, 545 
matrix, 308 
vectors, 140, 174 
Urey, 429 
Uspensky, 519 


Valence bond coordinates, 300 
Value of a physical quantity, most 
probable, 504 
true, 504 
Van der Waals’ equation, 5, 24 
Van der Waerden, 585, 586 
Van Vleck, 392, 429
Variable, canonically conjugate, 284 
cogredient, 318 
contragredient, 317 
extensive, 1 
independent, 32 
intensive, 1 
thermodynamic, 1 
Variables, method of separation of, 176 
Variation, 199 
Variation theory of eigenvalue prob- 
lems, 270 
Variational method, 377 
Variations, calculus of, 198 
Vector area, 144 
axial, 165 
column, 305 
components of, 137 
contravariant, 162 
covariant, 163 
curl of, 152 
curvilinear component of, 174 
differentiation of, 148 
divergence of, 151 
field, 149 
integration, 154 
irrotational, 154 
length of, 137 
magnitude of, 137 
operator, 174 
origin of, 137 
polar, 164
product, 142
pseudovector, 165
radius, 153
row, 305
solenoidal, 154
space, Hermitian, 328
linear, 310
sum, 141
terminus of, 137 
triple product, 147 
unit, 140, 174 
Vectors, base, 193 
difference of, 141 
linear independence of, 311, 312 
orthogonalization of, 312 
products of three, 146 
reciprocal, 193 
scalar product of, 141, 311 
Velocity, absolute, 289 
angular, 145, 285 
linear, 145, 285 
of following, 290 
potential, 217, 224 
relative, 289 
Vena contracta, 34 
Vibrating sphere, with node at surface, 
258 
string, 208, 247 
Vibration problems, 542 
Vibrational energy of a molecule, 
294 
frequency of a molecule, 296 
Vibrations, forced, 542
normal mode of, 296 
of a molecule, 290 
Vivanti-Schwank, 544 
v. Karman, 270 
v. Mises, 216, 244, 267
v. Neumann, 429
Volume element, 173
integrals, 156 
Volterra’s integral equation, 520 


Wade, 171, 332 
Walter, 430, 576, 585 
Watson, 80, 96, 136, 180, 543 
Wattless current, 56
Wave equation, 212, 228
Schrödinger, 176
space form of, 230, 231
Wave length, 230, 247
number, 230
packets, 396 
Waves 
in one dimension, 231 
in two dimensions, 231 
in three dimensions, 232 
Waves, monochromatic, 235 
plane, 229 
spherical, 230 
standing, 230 
vs. particles, 335
Weatherburn, 171 
Weaver, 180 
Webster, 245 
Weddle’s rule, 478 
Weierstrass, definition of gamma func- 
tion, 95 
p-function, 180
Weight of measurement, 514 
Wentzel, 215 
Weyl, 330, 582, 586 
Wheatstone bridge, 314 
Whittaker, 80, 96, 180, 282, 287, 326, 
495, 497, 519, 543, 566 
Widder, 267 
Wiener, 267 
Wigner, 287, 328, 560, 562, 564 ff., 582,
Willers, 519
Wilson, E. B., 141, 163, 193 
Wilson, E. B., Jr., 177, 299, 220, 367, 
382, 387, 429, 519 
Wintner, 305 
Wolfsohn, 119 
Work content, 13 
in thermodynamics, 10 
mechanical, 142 
Wronskian, 33, 134 
Wu, 296 


Yoe, 504 
Youden, 519 


Zachariasen, 586
Zassenhaus, 586
Zeeman effect, normal, 392
Zemansky, 1
Zernike, 437
Zonal harmonic, 66


TABLE A.1 CONVERSIONS, CONSTANTS, AND FORMULAS


Volume and Weight

1 U.S. gallon = 8.34 lbs × Sp Gr
1 U.S. gallon = 0.84 Imperial gallon
1 cu ft of liquid = 7.48 gal
1 cu ft of liquid = 62.32 lbs × Sp Gr
Specific gravity of sea water = 1.025 to 1.03
1 cu meter = 264.5 gal
1 barrel (oil) = 42 gal


Capacity and Velocity

gpm = 449 × cu ft per sec
gpm = lbs per hour / (500 × Sp Gr)
gpm = 0.069 × boiler Hp
gpm = 0.7 × bbl/hour = 0.0292 × bbl/day
1 gpm = 0.227 metric tons per hour
1 mgd = 694.5 gpm

$V = \dfrac{\text{gpm} \times 0.321}{\text{area in sq in}} = \dfrac{\text{gpm} \times 0.409}{D^2}$

$V = \sqrt{2gH}$

gpm = gallons per minute
Sp Gr = specific gravity based on water at 62°F
Hp = horsepower
bbl = barrel (oil) = 42 gal
mgd = million gallons per day of 24 hours
V = velocity in ft/sec
D = diameter in inches
g = 32.16 ft/sec/sec
H = head in feet
Head

$\text{Head in feet} = \dfrac{\text{Head in psi} \times 2.31}{\text{Sp Gr}}$

1 inch of mercury = 1.133 feet of water (cold, fresh)
1 psi = 0.0703 kilograms per sq centimeter
1 psi = 0.068 atmosphere

$H = \dfrac{V^2}{2g}$

psi = pounds per square inch
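
As a rough illustration of the velocity and head relations in this table, the following Python sketch uses V = gpm × 0.409 / D², H = V²/2g, and head in feet = head in psi × 2.31 / Sp Gr. The function names are hypothetical and chosen only for this example; the constants 0.409, 2.31, and g = 32.16 ft/sec/sec are the ones listed in the table.

```python
# Constants as given in the table (not independently derived here).
G_FT_PER_SEC2 = 32.16  # g in ft/sec/sec


def pipe_velocity(gpm, diameter_in):
    """Velocity in ft/sec for a flow in gpm through a pipe of diameter D inches."""
    return gpm * 0.409 / diameter_in ** 2


def velocity_head(velocity_ft_per_sec):
    """Velocity head H = V^2 / (2g), in feet."""
    return velocity_ft_per_sec ** 2 / (2 * G_FT_PER_SEC2)


def psi_to_feet(head_psi, sp_gr=1.0):
    """Convert a head in psi to feet of liquid of the given specific gravity."""
    return head_psi * 2.31 / sp_gr


if __name__ == "__main__":
    v = pipe_velocity(100, 2)  # 100 gpm through a 2 inch pipe
    print("velocity, ft/sec:", v)
    print("velocity head, ft:", velocity_head(v))
    print("10 psi in feet of water:", psi_to_feet(10))
```

The specific gravity defaults to 1.0 (water), which is an assumption of this sketch rather than something the table prescribes.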


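Power and Torque

1 horsepower = 550 ft-lb per sec
             = 33,000 ft-lb per min
             = 2545 Btu per hr
             = 745.7 watts
             = 0.7457 kilowatts

$\text{bhp} = \dfrac{\text{gpm} \times \text{Head in feet} \times \text{Sp Gr}}{3960 \times \text{efficiency}} = \dfrac{\text{gpm} \times \text{Head in psi}}{1714 \times \text{efficiency}}$

$\text{Torque in lbs feet} = \dfrac{\text{Hp} \times 5252}{\text{rpm}}$

bhp = brake horsepower
rpm = revolutions per minute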
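
The power and torque relations can be sketched in Python as follows, assuming the usual reading of the table: bhp = gpm × head in feet × Sp Gr / (3960 × efficiency), equivalently gpm × head in psi / (1714 × efficiency), and torque in lbs feet = Hp × 5252 / rpm. The function names are hypothetical, and efficiency is taken as a fraction (for example 0.75), which is an assumption of this example.

```python
def brake_horsepower(gpm, head_ft, sp_gr=1.0, efficiency=1.0):
    """bhp from flow (gpm), head (feet), specific gravity, and efficiency (fraction)."""
    return gpm * head_ft * sp_gr / (3960 * efficiency)


def brake_horsepower_from_psi(gpm, head_psi, efficiency=1.0):
    """bhp from flow (gpm) and head expressed in psi."""
    return gpm * head_psi / (1714 * efficiency)


def torque_lb_ft(hp, rpm):
    """Shaft torque in lbs feet for a given horsepower and speed."""
    return hp * 5252 / rpm


if __name__ == "__main__":
    bhp = brake_horsepower(gpm=500, head_ft=120, sp_gr=1.0, efficiency=0.75)
    print("bhp:", bhp)
    print("torque at 1750 rpm, lbs feet:", torque_lb_ft(bhp, 1750))
```

The two bhp forms agree only to within the rounding of the constants, since 3960 / 2.31 is about 1714.3.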


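Miscellaneous Centrifugal Pump Formulas

Specific speed: $N_s = \dfrac{\text{rpm} \times \sqrt{\text{gpm}}}{H^{3/4}}$
where H = head per stage in feet

Diameter of impeller in inches: $d = \dfrac{1840\, K_u \sqrt{H}}{\text{rpm}}$
where Ku is a constant varying with impeller type and design. Use H at
shut-off (zero capacity) and Ku is approx. 1.0

At constant speed:

$\dfrac{d_1}{d_2} = \dfrac{\text{gpm}_1}{\text{gpm}_2} = \dfrac{\sqrt{H_1}}{\sqrt{H_2}} = \dfrac{\sqrt[3]{\text{bhp}_1}}{\sqrt[3]{\text{bhp}_2}}$

At constant impeller diameter:

$\dfrac{\text{rpm}_1}{\text{rpm}_2} = \dfrac{\text{gpm}_1}{\text{gpm}_2} = \dfrac{\sqrt{H_1}}{\sqrt{H_2}} = \dfrac{\sqrt[3]{\text{bhp}_1}}{\sqrt[3]{\text{bhp}_2}}$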
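
A minimal Python sketch of these pump relations is given below. It assumes the standard definition of specific speed, Ns = rpm × √gpm / H^(3/4), and reads the constant-diameter ratios as: capacity varies with speed, head with the square of speed, and brake horsepower with the cube of speed. The function names are hypothetical.

```python
def specific_speed(rpm, gpm, head_ft):
    """Specific speed Ns, with head per stage in feet."""
    return rpm * gpm ** 0.5 / head_ft ** 0.75


def scale_for_speed_change(rpm1, rpm2, gpm1, head1, bhp1):
    """Scale capacity, head, and bhp from rpm1 to rpm2 at constant impeller diameter.

    From rpm1/rpm2 = gpm1/gpm2 = sqrt(H1)/sqrt(H2) = cbrt(bhp1)/cbrt(bhp2).
    """
    ratio = rpm2 / rpm1
    return gpm1 * ratio, head1 * ratio ** 2, bhp1 * ratio ** 3


if __name__ == "__main__":
    print("Ns:", specific_speed(rpm=1750, gpm=400, head_ft=16))
    gpm2, head2, bhp2 = scale_for_speed_change(1750, 3500, 100, 50, 5)
    print("at 3500 rpm:", gpm2, "gpm,", head2, "ft,", bhp2, "bhp")
```

The same scaling applies at constant speed with the impeller diameter ratio d2/d1 in place of the speed ratio.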


TABLE A.2

MEASUREMENT CONVERSION

Atmosphere.................. 14.7 pounds (English)
                             14.223 pounds (Russian)
Btu (British Thermal Unit).. 778 foot pounds
                             0.2930 watt hour
                             0.252 calorie
Calorie..................... 1 kilogram of water raised
                               1 degree Centigrade
                             3.97 Btu
Centare (square meter)...... 10.764 square feet
Centimeter.................. 0.3937 inch
Cheval (French hp).......... 0.986 horsepower
Cubic Centimeter (milliliter) 0.061 cubic inch
Cubic Foot.................. 1,728 cubic inches
                             7.48 gallons
                             60 pints
                             8/10 bushel
                             62.32 lbs water (f20F)
                             1,000 ounces of water, approx.
                             0.028 cubic meter
                             28.32 liters
Cubic Inch.................. 16.39 cubic centimeters
Cubic Meter................. 35.315 cubic feet
                             1.308 cubic yards
Cubic Yard.................. 27 cubic feet
                             0.765 cubic meter
Decimeter................... 3.937 inches
Foot........................ 12 inches
                             0.305 meter
Foot Pound.................. 0.1383 kilogrammeter
Gallon...................... 231 cubic inches
                             4 quarts
                             8 pints
                             3.785 liters
                             128 fluid ounces
                             8.33 pounds of water
Gallon per Minute........... 1/449 cubic foot per second
                             0.227 metric tons per hour
Gallon (British Imperial)... 277.3 cubic inches
                             1.201 U.S. gallons
                             10 lbs water at 15°C
                             4.546 liters
Gram........................ 15.43 grains
                             0.0353 ounce
                             0.0022 pound
Horsepower.................. 42.41 Btu per minute
                             1.014 cheval
                             746 watts
                             33,000 ft lb per minute
Inch........................ 39,450 wave lengths of
                               red ray of cadmium
                             25.4 millimeters
Kilogram.................... 2.2046 pounds
                             35.274 ounces
                             15432.36 grains
                             0.0011 short ton
                             0.00098 long ton
Kilogram per Cubic Meter.... 0.0624 lbs per cu ft
Kilogram per Square
  Centimeter................ 14.225 lbs per sq in
Kilogram per Square Meter... 0.205 lbs per sq ft
Kilometer................... 1,000 meters
                             0.621 mile
Kilowatt.................... 1.34 horsepower
                             44,257 ft lb per minute
                             56.87 Btu per minute
Liter....................... 1.000027 cubic decimeters
                             1.057 quart
                             0.264 gallon
                             61.02 cubic inches
                             0.035 cubic feet
                             33.8147 fluid ounces
                             270.518 fluid drams
Liter per Second............ 2.12 cu ft per minute
Meter....................... 39.37 inches
                             3.28 feet
                             1.09 yards
Metric Ton.................. 2204.6 pounds
                             1.1023 short tons
Mil......................... 0.001 inch
                             25.4 microns
                             0.0254 millimeter
Mile........................ 1,760 yards
                             5,280 feet
                             1.61 kilometers
Ounce....................... 437.5 grains
                             0.911 troy ounces
                             28.35 grams
Ounce (Fluid)............... 1.805 cubic inches
                             29.573 milliliters
Ounce (Fine)................ Troy ounce
                             480 grains
                             31.104 grams
............................ .474 U.S. gal per min
Pint........................ 0.4732 liter
                             16 fluid ounces
Pound Avoirdupois........... 16 ounces
                             7,000 grains
                             454 grams
                             0.454 kilogram
                             14.58 troy ounces
Pound per Cubic Foot........ 16.02 kilograms per cubic meter
Pound per Sq In............. 2.31 feet head of water at
                               1.00 sp gr
                             0.0703 kilogram per sq centimeter
Pound per Sq Ft.............
Quart....................... 0.946 liter
Square Centimeter........... 0.155 square inch
Square Foot................. 0.093 square meter
                             144 square inches
Square Inch................. 6.452 square centimeters
Square Kilometer............ 0.386 square mile
Square Meter (centare)...... 10.764 square feet
                             1.196 square yards
Square Mile................. 640 acres
                             3,097,600 square yards
                             2.59 square kilometers
Square Millimeter........... 0.00155 square inch
Square Yard................. 0.836 square meter
Stone (British)............. 14 pounds
                             6.35 kilograms
Ton (short)................. 2,000 pounds
Ton (long).................. 2,240 pounds
                             1,016 kilograms
                             270 gallons
Ton per Hour (metric)....... 4.4 gallons per minute
Tonne (metric).............. 1,000 kilograms
                             2204.62 pounds
Yard........................ 3 feet
                             36 inches
                             0.914 meter


